Gain Compensation Method for Codebook-Based Speech Enhancement

  • Jung, Seungmo (Department of Information and Communication Engineering, Sejong University) ;
  • Kim, Moo Young (Department of Information and Communication Engineering, Sejong University)
  • Received : 2014.06.23
  • Reviewed : 2014.08.26
  • Published : 2014.09.25

Abstract

Speech enhancement, which removes ambient noise, is increasingly important as a preprocessor for speech recognition. Among the various speech enhancement techniques, Codebook-based Speech Enhancement (CBSE) operates efficiently even in nonstationary noise environments. However, conventional CBSE estimates inaccurate gains when a mismatch occurs between the noisy input signal and the trained speech and noise codevectors. To compensate for these inaccurate gains, this paper proposes computing a Signal-to-Noise Ratio (SNR)-based Normalized Weighting Factor (NWF) for every frame using a long-term noise estimation algorithm, and applying it to the conventional gains. The proposed CBSE shows better performance than conventional CBSE.
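The abstract does not state the NWF formula, the long-term noise estimator, or the rule used to compensate the gains, so the following is only a minimal sketch of the general idea under loudly stated assumptions. The function names (`long_term_noise_power`, `normalized_weighting_factor`, `compensate_gains`), the smoothing constant `alpha`, the SNR clamp range of -5 dB to 20 dB, and the complementary reweighting rule are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def long_term_noise_power(frame_powers, alpha=0.95):
    """Toy long-term noise tracker (an assumption, not the paper's estimator).

    The estimate follows the frame power downward immediately and upward
    only slowly, so short speech bursts barely raise it.
    """
    estimates = []
    noise = frame_powers[0]
    for power in frame_powers:
        if power < noise:
            noise = power
        else:
            noise = alpha * noise + (1.0 - alpha) * power
        estimates.append(noise)
    return estimates


def normalized_weighting_factor(frame_power, noise_power,
                                snr_min_db=-5.0, snr_max_db=20.0, eps=1e-12):
    """Map a per-frame SNR estimate onto [0, 1] (hypothetical linear mapping)."""
    speech_power = max(frame_power - noise_power, eps)
    snr_db = 10.0 * math.log10(speech_power / max(noise_power, eps))
    nwf = (snr_db - snr_min_db) / (snr_max_db - snr_min_db)
    return min(1.0, max(0.0, nwf))


def compensate_gains(speech_gain, noise_gain, nwf):
    """Reweight codebook-derived gains by the NWF (illustrative rule only)."""
    return nwf * speech_gain, (1.0 - nwf) * noise_gain


if __name__ == "__main__":
    # Made-up frame powers: quiet frames, a loud burst, then quiet again.
    frame_powers = [1.0, 1.1, 0.9, 50.0, 80.0, 1.0]
    noise_track = long_term_noise_power(frame_powers)
    for power, noise in zip(frame_powers, noise_track):
        nwf = normalized_weighting_factor(power, noise)
        print(round(nwf, 3), compensate_gains(1.0, 1.0, nwf))
```

In this sketch a frame judged to have high SNR keeps most of its speech gain and has its noise gain reduced, while a low-SNR frame is treated the opposite way. Whether the paper applies the NWF in this complementary form is not stated in the abstract.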
