Adaptive Threshold for Speech Enhancement in Nonstationary Noisy Environments

An Adaptive Threshold Algorithm for Speech Enhancement in Nonstationary Noisy Environments

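  • Soo-Jeong Lee (이수정; Speech Signal Processing Laboratory, Kwangwoon University);
  • Soon-Hyob Kim (김순협; Department of Computer Engineering, Kwangwoon University)
  • Published: 2008.10.31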

Abstract

This paper proposes a new approach to speech enhancement in highly nonstationary noisy environments. Spectral subtraction (SS) is a well-known technique for speech enhancement in stationary noise. In the real world, however, noise is mostly nonstationary. The proposed method uses an automatic control parameter for an adaptive threshold so that it works well in highly nonstationary noise. Specifically, the control parameter is set by a linear function of the a posteriori signal-to-noise ratio (SNR), so that the threshold follows increases and decreases in the noise level. The proposed algorithm is combined with SS using a hangover (HO) scheme for speech enhancement. Performance is evaluated with the ITU-T P.835 signal distortion (SIG) rating and the segmental SNR in various highly nonstationary noisy environments, and the proposed method is superior to conventional SS using a hangover scheme and SS using minimum statistics (MS).

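This paper proposes a new method for speech enhancement in nonstationary noisy environments. Spectral subtraction is a well-known noise reduction method for speech enhancement in stationary noisy environments. Real noise environments, however, are mostly nonstationary. The proposed method uses an automatic control parameter for an adaptive threshold so that it works well across various noise types and nonstationary conditions. In particular, the automatic control parameter applies a linear function of the a posteriori SNR to control the adaptive threshold as the noise level rises or falls. The proposed algorithm is combined with spectral subtraction using a hangover (HO) scheme for speech enhancement. Evaluated with ITU-T P.835 signal distortion (SIG) and segmental signal-to-noise ratio (SNR) in various noise environments, the algorithm outperformed speech detection using HO and the minimum statistics (MS) method.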
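
The abstract gives the idea but not the formulas, so the following Python sketch is only an illustration of the general scheme, not the authors' algorithm. It assumes a frame-level a posteriori SNR (noisy power over the current noise estimate), a detection threshold that is a clipped linear function of a recursively smoothed a posteriori SNR, a simple hangover counter, and power spectral subtraction with a spectral floor. The function name `enhance` and every constant (`slope`, `thr_min`, `thr_max`, `lam`, `alpha`, `hangover`, `over`, `floor`, `n_init`) are hypothetical choices made for this example.

```python
import numpy as np

def enhance(power, slope=1.0, thr_min=1.5, thr_max=5.0, lam=0.8,
            alpha=0.9, hangover=3, over=1.0, floor=0.01, n_init=5):
    """Spectral subtraction with an adaptive noise-update threshold (sketch).

    power: array (frames, bins) holding the noisy power spectrogram |Y|^2.
    All parameter names and default values are illustrative assumptions.
    Returns (enhanced_power, noise_track), both shaped like `power`.
    """
    power = np.asarray(power, dtype=float)
    noise = power[:n_init].mean(axis=0)   # initial noise PSD from leading frames
    snr_bar = 1.0                         # smoothed a posteriori SNR
    hang = 0                              # remaining hangover frames
    enhanced = np.empty_like(power)
    noise_track = np.empty_like(power)

    for t, frame in enumerate(power):
        # Frame-level a posteriori SNR against the current noise estimate.
        post_snr = frame.mean() / max(noise.mean(), 1e-12)
        # Assumed rule: threshold is linear in the smoothed SNR, then clipped.
        thr = float(np.clip(thr_min + slope * (snr_bar - 1.0), thr_min, thr_max))

        if post_snr >= thr:
            hang = hangover               # speech-like frame: restart hangover
        elif hang > 0:
            hang -= 1                     # hangover: still treated as speech
        else:
            # Noise-only frame: recursive update of the noise estimate.
            noise = alpha * noise + (1.0 - alpha) * frame

        snr_bar = lam * snr_bar + (1.0 - lam) * post_snr

        # Power spectral subtraction with a spectral floor.
        enhanced[t] = np.maximum(frame - over * noise, floor * frame)
        noise_track[t] = noise

    return enhanced, noise_track
```

Calling `enhance(power)` on a `(frames, bins)` power spectrogram returns the enhanced power and the per-frame noise estimate. Under these assumed constants, a sustained rise in the a posteriori SNR raises the threshold until noise updates resume, whereas `slope=0.0` reduces the rule to a fixed threshold that stops updating once the noise floor jumps above it.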
