Gaussian Mixture Model using Minimum Classification Error for Environmental Sounds Recognition Performance Improvement

Improving the Environmental Sound Recognition Performance of Gaussian Mixture Models by Introducing the Minimum Classification Error Method

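  • 한다정 (School of Electronics and Computer Engineering, Chonnam National University) ;
  • 박아론 (School of Electronics and Computer Engineering, Chonnam National University) ;
  • 박준규 (School of Electronics and Computer Engineering, Chonnam National University) ;
  • 백성준 (School of Electronics and Computer Engineering, Chonnam National University)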
  • Received : 2011.09.22
  • Accepted : 2011.12.09
  • Published : 2011.12.28


In this paper, we propose minimum classification error (MCE) training as a GMM training method to improve environmental sound recognition performance. For discriminative training, we model the environmental sound data with a newly defined misclassification function that uses the log-likelihood of the corresponding class and the log-likelihoods of the remaining classes. The model parameters are estimated from the loss function using GPD (generalized probabilistic descent). To compare recognition performance, we extracted 12th-order features from 9 kinds of environmental sounds using preprocessing and MFCC (mel-frequency cepstral coefficients), and carried out GMM classification experiments. In the experiments, the MCE training method showed the best performance, averaging 87.06% with 19 mixtures. This result confirms that the MCE training method can be used effectively as a GMM training method for environmental sound recognition.
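
The following is a minimal sketch of the quantities an MCE criterion is built from, not the paper's own formulation. It assumes the commonly used misclassification measure d_k = -g_k + (1/eta) log(mean over j != k of exp(eta g_j)) with a sigmoid loss, where g_j is the GMM log-likelihood of class j. The paper's "newly defined" misclassification function may differ, and the function names and the parameters eta and gamma are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    per_component = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        per_component.append(ll)
    return logsumexp(per_component)


def misclassification(scores, true_class, eta=1.0):
    """Assumed measure: negative means correct, positive means misclassified."""
    rivals = [eta * g for j, g in enumerate(scores) if j != true_class]
    anti_score = (logsumexp(rivals) - math.log(len(rivals))) / eta
    return anti_score - scores[true_class]


def sigmoid_loss(d, gamma=1.0):
    """Smooth 0-1 loss in (0, 1); gamma (assumed) controls the slope."""
    z = gamma * d
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Illustrative class log-likelihoods (made-up values, not from the paper).
scores = [-1.0, -3.0, -5.0]
d_correct = misclassification(scores, true_class=0)
d_wrong = misclassification(scores, true_class=2)
print(d_correct, sigmoid_loss(d_correct))
print(d_wrong, sigmoid_loss(d_wrong))
```

Because the sigmoid loss is differentiable in the GMM parameters, a GPD-style update would move each parameter a small step along the negative gradient of this loss for each training sample; that step is omitted here.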


Context Aware;Environmental Sounds;GMM;MLE;Minimum Classification Error


Supported by: National Research Foundation of Korea

