Recognition for Noisy Speech by a Nonstationary AR HMM with Gain Adaptation Under Unknown Noise

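  • 이기용 (School of Information, Communication and Electronic Engineering, Soongsil University) ;
  • 서창우 (School of Information, Communication and Electronic Engineering, Soongsil University) ;
  • 이주헌 (Department of Internet Broadcasting, Dong-Ah Broadcasting College)
  • Published: 2002.01.01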

Abstract

This paper develops a time-domain, gain-adapted speech recognition method for speech corrupted by additive noise. The noise is assumed to be colored. To cope with the markedly nonstationary nature of speech sounds such as fricatives, glides, and liquids, and of the transition regions between phones, a nonstationary autoregressive hidden Markov model (NAR-HMM) is used. The nonstationary AR process is represented by polynomial functions formed as a linear combination of M known basis functions. When only the noisy signal is available, the noise model must itself be estimated. Multiple Kalman filters are used to estimate the noise model and the gain contour of the speech. The noise estimation in the proposed method removes noise from the noisy speech to yield an enhanced speech signal. Compared with the conventional ARHMM with noise estimation, the proposed NAR-HMM with noise estimation improves recognition performance by about 2-3%.
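As a rough illustration of the nonstationary AR idea described in the abstract, the sketch below builds time-varying AR coefficients as a linear combination of M known basis functions and uses them to synthesize a signal. It is not the authors' implementation: the power basis f_m(t) = (t/T)^m, the sign convention of the recursion, the constant gain, and all names (`poly_basis`, `ar_coeffs_at`, `synthesize`, the weight matrix `C`) are assumptions made for this example. The HMM states, the gain contour estimation, and the Kalman filtering are not modeled here.

```python
import random

def poly_basis(t, T, M):
    """Return [f_0(t), ..., f_{M-1}(t)] with f_m(t) = (t / T) ** m (assumed basis)."""
    u = t / T
    return [u ** m for m in range(M)]

def ar_coeffs_at(t, T, C):
    """Time-varying AR coefficients a_i(t) = sum_m C[i][m] * f_m(t).

    C is a p x M list of basis weights (hypothetical values chosen by the caller).
    """
    M = len(C[0])
    f = poly_basis(t, T, M)
    return [sum(c_im * f_m for c_im, f_m in zip(row, f)) for row in C]

def synthesize(C, T, gain=1.0, seed=0):
    """Generate x(t) = sum_i a_i(t) * x(t - i) + gain * e(t), with e(t) ~ N(0, 1)."""
    rng = random.Random(seed)
    p = len(C)
    x = [0.0] * p  # zero initial conditions
    for t in range(T):
        a = ar_coeffs_at(t, T, C)
        past = x[-1:-p - 1:-1]  # x(t-1), ..., x(t-p)
        x.append(sum(ai * xi for ai, xi in zip(a, past)) + gain * rng.gauss(0.0, 1.0))
    return x[p:]

# Order-2 process whose coefficients drift linearly over the segment (M = 2).
C = [[1.2, -0.4],
     [-0.6, 0.1]]
signal = synthesize(C, T=200)
```

With M = 1 the coefficients are constant over the segment and the model reduces to an ordinary stationary AR process, which is the case the conventional ARHMM corresponds to.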

References

  1. Fundamentals and applications Robustness in automatic speech recognition J. C. Junqua;J. P. Haton
  2. IEEE Trans. Acoust., Speech, Signal Processing v.33 Mixture autoregressive hidden Markov models for speech signals B. Juang;L. R. Rabiner https://doi.org/10.1109/TASSP.1985.1164727
  3. IEEE Trans. Signal Processing v.40 no.6 Gain adapted hidden Markov models for recognition of clean and noisy speech Y. Ephraim https://doi.org/10.1109/78.139237
  4. Signal Processing v.27 A Generalized hidden Markov model with state-conditioned trend functions of time for speech signal L. Deng https://doi.org/10.1016/0165-1684(92)90112-A
  5. IEEE Trans. Speech and Audio Processing v.2 no.4 Speech recognition using HMM with polynomial regression functions as nonstationary states L. Deng;M. Aksmanovic;X. Sun;C. F. Jeff Wu https://doi.org/10.1109/89.326610
  6. Proc. ICSLP '98 v.2 A nonstationary autoregressive HMM with gain adaptation for speech recognition K. Y. Lee;J. Lee
  7. IEEE Trans. on Speech and Audio Processing v.1 no.4 A stochastic model of speech incorporating hierarchical nonstationarity L. Deng https://doi.org/10.1109/89.242494
  8. Computer Speech and Language v.9 no.1 A Markov model containing state-conditioned second-order nonstationarity: Application to speech recognition L. Deng;C. Rathinavalu https://doi.org/10.1006/csla.1995.0004
  9. IEEE Trans. on Speech and Audio Processing v.2 no.1 Waveform-based speech recognition using hidden filter model: Parameter selection and sensitivity to power normalization H. Sheikhzadeh;L. Deng https://doi.org/10.1109/89.260337
  10. Technical Report CUED/F-INFENG/TR135 PMC for speech recognition in additive and convolutional noise M. J. F. Gales;S. J. Young
  11. IEEE Trans. Acoust., Speech, Signal Processing v.31 no.4 Time-Dependent ARMA Modelling of Nonstationary Signals Y. Grenier https://doi.org/10.1109/TASSP.1983.1164152
  12. J. Royal Stat. Soc. B v.39 no.1 Maximum likelihood from incomplete data via the EM Algorithm A. P. Dempster;N. M. Laird;D. B. Rubin
  13. Ann. Math. Stat. v.41 A Maximization Technique in the statistical analysis of probabilistic functions of Markov chains L. E. Baum;T. Petrie;G. Soules;N. Weiss https://doi.org/10.1214/aoms/1177697196
  14. Proc. Eurospeech '97 v.4 A nonstationary autoregressive HMM and its application to speech enhancement K. Y. Lee;J. Rheem
  15. IEEE Trans. on Signal Processing v.39 no.8 Filtering of Colored Noise for Speech Enhancement and Coding J. D. Gibson;B. Koo;S. D. Gray https://doi.org/10.1109/78.91144
  16. IEEE Trans. on Circuits and Systems v.45 no.8 Subband Kalman filtering for speech enhancement W. Wu;P. Chen https://doi.org/10.1109/82.718814
  17. IEEE Trans. on Speech and Audio Processing v.5 no.2 Enhancement of connected words in an extremely noisy environment Y. Cohen;A. Erell;Y. Bistritz
  18. IEEE Trans. on Speech and Audio Processing On the application of the interacting multiple model algorithm for enhancing noisy speech J. B. Kim;K. Y. Lee;C. W. Lee