A Data-Driven Jacobian Adaptation Method for Noisy Speech Recognition

  • Yong-Joo Chung (Department of Electronics Engineering, Keimyung University)
  • Published: 2006.05.01

Abstract

This paper proposes a data-driven method that improves the performance of Jacobian adaptation (JA) for noisy speech recognition. Instead of constructing the reference HMM with a model composition method such as parallel model combination (PMC), we train the reference HMM directly on noisy speech. The motivation is that a directly trained reference HMM should model the acoustic variation caused by noise better than a composite HMM. The Jacobian matrices are estimated with the Baum-Welch algorithm during training. Recognition experiments on noisy speech show that the proposed method outperforms conventional Jacobian adaptation as well as other model compensation methods.
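
As background for the abstract, the following is a minimal sketch of the conventional, analytic form of Jacobian adaptation, not the data-driven estimation proposed in the paper, whose details the abstract does not give. It assumes full-length cepstra computed with an orthonormal DCT (so the inverse transform is the transpose) and a log-add mismatch model in which speech and noise add in the linear spectral domain. Under those assumptions the Jacobian of the noisy cepstral mean with respect to the noise cepstrum is J = C diag(N / (S + N)) C^-1, and a reference mean is shifted by J times the change in noise cepstrum. All function names here are hypothetical and chosen for illustration only.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); its inverse is its transpose."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                     for i in range(n)])
    return rows

def transpose(a):
    return [list(col) for col in zip(*a)]

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def noisy_cepstrum(c_speech, c_noise, C):
    """Assumed log-add model: speech and noise add in the linear spectrum."""
    Ct = transpose(C)
    log_s = matvec(Ct, c_speech)   # cepstrum -> log spectrum
    log_n = matvec(Ct, c_noise)
    log_y = [math.log(math.exp(s) + math.exp(n)) for s, n in zip(log_s, log_n)]
    return matvec(C, log_y)        # log spectrum -> cepstrum

def jacobian(c_speech, c_noise, C):
    """J = C diag(N / (S + N)) C^-1, evaluated at the reference noise."""
    Ct = transpose(C)
    log_s = matvec(Ct, c_speech)
    log_n = matvec(Ct, c_noise)
    ratio = [math.exp(n) / (math.exp(s) + math.exp(n))
             for s, n in zip(log_s, log_n)]
    scaled = [[r * v for v in row] for r, row in zip(ratio, Ct)]  # diag(ratio) C^T
    return matmul(C, scaled)

def adapt_mean(mu_ref, J, c_noise_ref, c_noise_new):
    """First-order update: mu_new = mu_ref + J (n_new - n_ref)."""
    delta = [a - b for a, b in zip(c_noise_new, c_noise_ref)]
    shift = matvec(J, delta)
    return [m + s for m, s in zip(mu_ref, shift)]

if __name__ == "__main__":
    C = dct_matrix(4)
    speech = [1.0, 0.5, -0.2, 0.1]      # illustrative clean-speech cepstral mean
    noise_ref = [0.2, 0.1, 0.0, 0.0]    # noise seen when the reference model was built
    noise_new = [0.25, 0.1, 0.0, 0.0]   # noise observed at test time
    mu_ref = noisy_cepstrum(speech, noise_ref, C)
    J = jacobian(speech, noise_ref, C)
    print(adapt_mean(mu_ref, J, noise_ref, noise_new))
```

Because the update is a first-order Taylor expansion around the reference noise, it is accurate only when the test noise stays close to the reference condition. In this sketch the reference mean comes from the log-add formula, which plays the role of a composite model; a reference mean estimated directly from noisy speech, as the abstract proposes, would take its place in `adapt_mean`.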
