Self-Adaptation Algorithm Based on Maximum A Posteriori Eigenvoice for Korean Connected Digit Recognition

한국어 연결 숫자음 인식을 위한 최대 사후 Eigenvoice에 근거한 자기적응 기법

  • 김동국 (전남대학교 전자컴퓨터정보통신공학부)
  • 전형배 (한국전자통신연구원)
  • Published : 2004.11.01

Abstract

This paper presents a new self-adaptation algorithm based on maximum a posteriori (MAP) eigenvoice for Korean connected digit recognition. The proposed MAP eigenvoice is developed by introducing a probability density model for the eigenvoice coefficients. The proposed approach provides a unified framework that incorporates the prior model into conventional eigenvoice estimation. In a self-adaptation system, the only adaptation data is the single utterance that is about to be recognized, so we use MAP eigenvoice, a highly robust adaptation method. In a series of self-adaptation experiments on the Korean connected digit recognition task, we demonstrate that the proposed approach outperforms the conventional eigenvoice algorithm when only a small amount of adaptation data is available.

본 논문에서는 한국어 연결 숫자음 인식을 위한 최대 사후 eigenvoice를 사용한 자기적응 기법을 제안한다. 제안된 최대 사후 eigenvoice 기법은 eigenvoice 계수에 대한 확률 밀도 함수를 가정함으로써 구성된다. 제안된 알고리즘은 기존 eigenvoice 추정 과정에 사전 분포 모델을 포함하는 일반적인 해를 제공하는 구조를 갖는다. 인식할 한 문장만을 사용하는 자기 적응 시스템을 위해 매우 강인한 특성을 갖는 최대 사후 eigenvoice 적응 기법을 사용하였다. 한국어 연결 숫자음에 대한 일련의 자기 적응 실험 결과, 제안된 알고리즘은 매우 적은 양의 적응 데이터에 대해 기존 eigenvoice 알고리즘보다 우수한 성능을 나타내었다.
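
The difference between conventional and MAP eigenvoice estimation can be illustrated with a minimal sketch. The abstract does not specify the form of the prior, so this sketch assumes a zero-mean Gaussian prior with diagonal covariance on the eigenvoice coefficients, diagonal-covariance Gaussians in the HMM, and occupation statistics already accumulated per supervector dimension. All function and variable names (`solve`, `estimate_weights`, `prior_var`, and so on) are hypothetical and are not taken from the paper.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def estimate_weights(e0, eigvoices, precision, counts, sums, prior_var=None):
    """Estimate eigenvoice coefficients w for the model mu = e0 + sum_k w[k] * e_k.

    e0        : mean supervector (length D)
    eigvoices : K eigenvoice supervectors (each of length D)
    precision : inverse variance for each supervector dimension
    counts    : occupation count of the Gaussian owning each dimension
    sums      : occupation-weighted sum of observations for each dimension
    prior_var : assumed variances of a zero-mean Gaussian prior on w.
                None gives the maximum-likelihood (conventional) estimate.
    """
    K, D = len(eigvoices), len(e0)
    # Normal equations A w = b of the weighted least-squares problem.
    A = [[sum(counts[d] * precision[d] * eigvoices[j][d] * eigvoices[k][d]
              for d in range(D)) for k in range(K)] for j in range(K)]
    b = [sum(precision[d] * eigvoices[j][d] * (sums[d] - counts[d] * e0[d])
             for d in range(D)) for j in range(K)]
    if prior_var is not None:
        # Assumed prior: its precision is added to the diagonal, which
        # shrinks w toward zero when the adaptation data is scarce.
        for k in range(K):
            A[k][k] += 1.0 / prior_var[k]
    return solve(A, b)


# Toy statistics standing in for one short adaptation utterance.
e0 = [0.0, 0.0, 0.0, 0.0]
eigvoices = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
precision = [1.0, 1.0, 1.0, 1.0]
counts = [2.0, 2.0, 2.0, 2.0]
sums = [2.0, -1.0, 2.0, -1.0]

w_ml = estimate_weights(e0, eigvoices, precision, counts, sums)
w_map = estimate_weights(e0, eigvoices, precision, counts, sums,
                         prior_var=[0.25, 0.25])
print(w_ml)   # [1.0, -0.5]
print(w_map)  # [0.5, -0.25]
```

In this toy case the prior halves both coefficients. With zero occupation counts the maximum-likelihood system is singular, while the MAP system still has a solution (the prior mean), which is one way to see why a prior helps when a single utterance is the only adaptation data.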
