A Closed-Form Solution of Linear Spectral Transformation for Robust Speech Recognition

  • Kim, Dong-Hyun (Speech Information Processing Laboratory, Department of Computer and Communication Engineering, Korea University) ;
  • Yook, Dong-Suk (Speech Information Processing Laboratory, Department of Computer and Communication Engineering, Korea University)
  • Received: January 12, 2009
  • Accepted: April 21, 2009
  • Published: August 30, 2009

Abstract

The maximum likelihood linear spectral transformation (ML-LST) using a numerical iteration method has previously been proposed for robust speech recognition. However, the numerical iteration method is unsuitable for real-time applications because of its high computational cost. To reduce this cost, this paper approximates the objective function of the ML-LST and derives a closed-form solution. Experiments show that the proposed closed-form solution for the ML-LST provides rapid speaker and environment adaptation for robust speech recognition.
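The paper's actual objective function and its closed-form solution appear in the full text, not on this abstract page. As a rough illustration of the kind of closed-form maximum likelihood estimate involved, the sketch below assumes a simplified setting: a per-dimension affine spectral transform y = a·x + b, fixed Gaussian component posteriors, and diagonal variances. All function names, shapes, and the single-dimension restriction are hypothetical simplifications, not the paper's formulation.

```python
import numpy as np

def estimate_transform(x, gamma, mu, var):
    """Closed-form ML estimate of an affine transform y = a*x + b
    for one spectral dimension (illustrative simplification only).

    x     : (T,)   observed linear-spectral features
    gamma : (T, K) fixed Gaussian component posteriors
    mu    : (K,)   component means in the same domain
    var   : (K,)   component variances (diagonal assumption)

    Maximizing the Gaussian log-likelihood in (a, b) with the
    posteriors held fixed reduces to precision-weighted least
    squares, whose normal equations form a 2x2 linear system.
    """
    w = gamma / var                      # (T, K) precision-weighted posteriors
    s_w = w.sum()                        # sum of weights
    s_wx = (w * x[:, None]).sum()        # weighted sum of x
    s_wxx = (w * (x ** 2)[:, None]).sum()  # weighted sum of x^2
    s_wm = (w * mu).sum()                # weighted sum of targets mu
    s_wxm = (w * x[:, None] * mu).sum()  # weighted cross term

    # Normal equations: [[s_wxx, s_wx], [s_wx, s_w]] @ [a, b] = [s_wxm, s_wm]
    A = np.array([[s_wxx, s_wx], [s_wx, s_w]])
    rhs = np.array([s_wxm, s_wm])
    a, b = np.linalg.solve(A, rhs)
    return a, b
```

Because the estimate is a single 2x2 solve per dimension rather than a numerical iteration, it illustrates why a closed-form solution enables the rapid adaptation the abstract claims.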

Keywords

References

  1. D. Kim and D. Yook, “Fast Channel Adaptation for Continuous Density HMMs Using Maximum Likelihood Spectral Transform,” IEE Electron. Lett., vol. 40, no. 10, 2004, pp. 632-633. https://doi.org/10.1049/el:20040395
  2. D. Kim and D. Yook, “Robust Model Adaptation Using Mean and Variance Transformation in Linear Spectral Domain,” Lecture Notes in Computer Science, vol. 3578, 2005, pp. 149-154.
  3. C. Leggetter and P. Woodland, “Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models,” Computer Speech and Language, vol. 9, 1995, pp. 171-185. https://doi.org/10.1006/csla.1995.0010
  4. M. Gales, Model-Based Techniques for Noise Robust Speech Recognition, Ph.D. Thesis, Cambridge University, 1995.
  5. D. Kim and D. Yook, “Linear Spectral Transformation for Robust Speech Recognition Using Maximum Mutual Information,” IEEE Signal Process. Lett., vol. 14, no. 7, 2007, pp. 496-499. https://doi.org/10.1109/LSP.2006.891337

Cited by

  1. “Three-Stage Framework for Unsupervised Acoustic Modeling Using Untranscribed Spoken Content,” vol. 32, no. 5, https://doi.org/10.4218/etrij.10.1510.0092