Improved Automatic Lipreading by Multiobjective Optimization of Hidden Markov Models

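  • Jong-Seok Lee (이종석; School of Electrical Engineering and Computer Science, KAIST)
  • Cheol Hoon Park (박철훈; School of Electrical Engineering and Computer Science, KAIST)
  • Published: 2008.02.29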

Abstract

This paper proposes a multiobjective optimization method for discriminative training of the hidden Markov models (HMMs) used as the recognizer in automatic lipreading, that is, speech recognition from lip movements. The conventional Baum-Welch algorithm trains each HMM to maximize the probability of the data of its own class. To improve discrimination between classes, we instead define a new training criterion composed of two minimization objectives and develop a global multiobjective optimization method for this criterion based on simulated annealing. In a speaker-dependent recognition experiment, the proposed method reduces the recognition error rate by about 8% in relative terms compared with Baum-Welch training.
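
The abstract does not state the two objectives, the cooling schedule, or the acceptance rule, so the following is only a minimal sketch of simulated annealing over two minimization objectives, not the authors' method. The toy objectives, the 1/k cooling, the Cauchy-distributed step, the worst-case acceptance cost, and all function names are assumptions made for illustration; in the paper's setting the search variable would be the HMM parameters rather than a single number.

```python
import math
import random


def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def toy_objectives(x):
    # Hypothetical stand-ins for the two minimization objectives (assumption).
    return ((x - 1.0) ** 2, (x + 1.0) ** 2)


def anneal(objectives, x0, steps=2000, t0=1.0, seed=0):
    """Return an archive of mutually non-dominated (x, objectives(x)) pairs."""
    rng = random.Random(seed)
    x, fx = x0, objectives(x0)
    archive = [(x, fx)]
    for k in range(1, steps + 1):
        temp = t0 / k  # assumed cooling schedule
        # Assumed neighbor generation: Cauchy-distributed step scaled by temperature.
        cand = x + temp * math.tan(math.pi * (rng.random() - 0.5))
        fc = objectives(cand)
        # Assumed acceptance cost: the largest increase over the two objectives.
        delta = max(c - f for c, f in zip(fc, fx))
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x, fx = cand, fc
            # Keep the archive free of dominated or duplicate objective vectors.
            if not any(dominates(f, fc) or f == fc for _, f in archive):
                archive = [(y, f) for y, f in archive if not dominates(fc, f)]
                archive.append((cand, fc))
    return archive
```

Calling `anneal(toy_objectives, x0=3.0)` returns a list of trade-off solutions rather than a single optimum, which is the main practical difference from optimizing one likelihood criterion as Baum-Welch does.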
