Parts-Based Feature Extraction of Spectrum of Speech Signal Using Non-Negative Matrix Factorization

  • Park, Jeong-Won (Department of Electronic Engineering, Dong-A University) ;
  • Kim, Chang-Keun (Department of Electronic Engineering, Dong-A University) ;
  • Lee, Kwang-Seok (Department of Electronic Engineering, Jinju National University) ;
  • Koh, Si-Young (School of Electronic Information and Communication Engineering, Kyungil University) ;
  • Hur, Kang-In (Department of Electronic Engineering, Dong-A University)
  • Published : 2003.12.01


In this paper, we propose a new speech feature parameter obtained through parts-based feature extraction from the speech spectrum using Non-Negative Matrix Factorization (NMF). NMF can effectively reduce the dimensionality of multi-dimensional data through matrix factorization under non-negativity constraints, and the dimensionally reduced data represent parts-based features of the input data. For speech feature extraction, we applied Mel-scaled filter bank outputs as inputs to NMF, then used the outputs of NMF as inputs to the speech recognizer. From the results of the recognition experiment, we confirmed that the proposed feature parameter outperforms the commonly used mel frequency cepstral coefficients (MFCC) in recognition performance.
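
The factorization step described above can be illustrated with a minimal sketch. It assumes the widely used multiplicative update rules for the Euclidean cost; the abstract does not state which cost function, number of filter bank channels, or factorization rank the authors used, so the function name `nmf`, the 24-band input, and the rank of 4 are hypothetical choices. The input matrix is synthetic non-negative data standing in for Mel-scaled filter bank outputs.

```python
import numpy as np


def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (bands x frames) as W @ H.

    W (bands x rank) holds non-negative basis vectors ("parts");
    H (rank x frames) holds the non-negative activations per frame.
    Multiplicative updates for the Euclidean cost; illustrative only.
    """
    rng = np.random.default_rng(seed)
    n_bands, n_frames = V.shape
    # Strictly positive random start so the updates never divide by zero.
    W = rng.random((n_bands, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Element-wise rescaling keeps W and H non-negative at every step.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for filter bank outputs (assumed 24 bands, 100 frames).
rng = np.random.default_rng(1)
V = rng.random((24, 4)) @ rng.random((4, 100))

W, H = nmf(V, rank=4)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# One reduced-dimension feature vector per frame, as input to a recognizer.
features = H.T
```

In this sketch each column of `H` is a 4-dimensional non-negative code for one frame, which is the kind of reduced representation the abstract feeds to the recognizer in place of MFCC.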


Non-Negative Matrix Factorization;Parts-based Feature Extraction;Mel-scaled Filter Bank Output

