On-line noise coherence estimation algorithm for binaural speech enhancement system

양이형 음성 음질개선 시스템을 위한 온라인 잡음 상관도 추정 알고리즘

Ji, Youna; Baek, Yong-hyun; Park, Young-cheol
지유나;백용현;박영철

  • Received: 2016.03.03
  • Accepted: 2016.04.19
  • Published: 2016.05.31

Abstract

This paper proposes an on-line noise coherence estimation algorithm for binaural speech enhancement systems. Several noise Power Spectral Density (PSD) estimation algorithms that exploit the noise coherence between two microphones have been proposed to improve speech enhancement performance. Conventional algorithms characterize the noise coherence with a real-valued analytic model. In real environments, however, the noise coherence between the two microphones varies over time and departs from the analytic model. The proposed algorithm therefore updates the noise coherence as the acoustic environment changes, so that it tracks the actual noise coherence. The noise coherence is updated only while speech is absent. Simulation results show that the proposed algorithm outperforms conventional algorithms based on the analytic model.
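
The idea of replacing a fixed analytic coherence with one that is tracked during speech-absent frames can be sketched as follows. This is only an illustrative sketch, not the authors' algorithm: the update equations of the paper are not reproduced on this page, so the first-order recursive averaging, the smoothing factor `alpha`, the free-field diffuse model sin(x)/x used as the starting value, and the externally supplied `speech_absent` flag are all assumptions, and the names `diffuse_coherence` and `OnlineNoiseCoherence` are hypothetical.

```python
import math


def diffuse_coherence(freq_hz, mic_distance_m, speed_of_sound=343.0):
    """Real-valued free-field diffuse-noise model: sin(x)/x, x = 2*pi*f*d/c.

    Assumed here as the 'analytic model'; the paper may use a different one.
    """
    x = 2.0 * math.pi * freq_hz * mic_distance_m / speed_of_sound
    return 1.0 if x == 0.0 else math.sin(x) / x


class OnlineNoiseCoherence:
    """Per-bin noise coherence tracker, updated only in speech-absent frames."""

    def __init__(self, init_coherence, alpha=0.9, eps=1e-12):
        self.alpha = alpha  # smoothing factor (assumed value)
        self.eps = eps      # guards against division by zero
        self.coherence = [complex(g) for g in init_coherence]
        self._p_ll = None   # smoothed auto-PSD, left channel
        self._p_rr = None   # smoothed auto-PSD, right channel
        self._p_lr = None   # smoothed cross-PSD

    def update(self, left, right, speech_absent):
        """left/right: complex STFT bins of one frame; returns the coherence."""
        if not speech_absent:
            return self.coherence  # freeze the estimate while speech is present

        inst_ll = [abs(l) ** 2 for l in left]
        inst_rr = [abs(r) ** 2 for r in right]
        inst_lr = [l * r.conjugate() for l, r in zip(left, right)]

        if self._p_ll is None:
            # First noise-only frame: seed the spectra so that the starting
            # coherence still equals the analytic model (a design assumption).
            self._p_ll = inst_ll
            self._p_rr = inst_rr
            self._p_lr = [g * math.sqrt(a * b)
                          for g, a, b in zip(self.coherence, inst_ll, inst_rr)]
        else:
            a = self.alpha
            self._p_ll = [a * p + (1 - a) * x for p, x in zip(self._p_ll, inst_ll)]
            self._p_rr = [a * p + (1 - a) * x for p, x in zip(self._p_rr, inst_rr)]
            self._p_lr = [a * p + (1 - a) * x for p, x in zip(self._p_lr, inst_lr)]

        # Complex coherence: cross-PSD normalized by the two auto-PSDs.
        self.coherence = [
            p_lr / math.sqrt(p_ll * p_rr + self.eps)
            for p_lr, p_ll, p_rr in zip(self._p_lr, self._p_ll, self._p_rr)
        ]
        return self.coherence


# Usage: start from the analytic model, then adapt on noise-only frames.
bin_freqs_hz = [0.0, 1000.0, 4000.0]
tracker = OnlineNoiseCoherence([diffuse_coherence(f, 0.17) for f in bin_freqs_hz])
```

In this sketch the tracker keeps the analytic value until the first noise-only frame arrives and never adapts on frames flagged as containing speech, which mirrors the constraint stated in the abstract. How speech absence is detected is left outside the sketch.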

Keywords

Noise coherence; Diffuse noise field; Binaural speech enhancement; Noise PSD (Power Spectral Density) estimator

Acknowledgement

Supported by: Korea Institute for Advancement of Technology (한국산업기술진흥원)