Efficient Implementation of SVM-Based Speech/Music Classification on Embedded Systems

  • Received : 2011.06.05
  • Accepted : 2011.10.19
  • Published : 2011.11.30

Abstract

Accurate classification of input signals is a key prerequisite for variable bit-rate coding, which was introduced to make effective use of limited communication bandwidth. In particular, the recent surge in multimedia services has raised the importance of speech/music classification. Among the many speech/music classifiers, those based on the support vector machine (SVM) offer high classification accuracy, but their computational complexity and memory requirements hinder their adoption in actual implementations. Techniques that reduce computational complexity and memory requirements are therefore essential, particularly for embedded systems. We first analyze the implementation of an SVM-based classifier on embedded systems in terms of execution time and energy consumption, and then propose two techniques that ease the implementation requirements: one removes support vectors that contribute insignificantly to the final classification, and the other skips processing of some input frames by exploiting the strong correlation between speech/music frames. Both are post-processing techniques that can be combined with any other optimization applied during the SVM training phase. Through experiments, we validate the proposed algorithms in terms of classification accuracy, execution time, and energy consumption.

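Variable bit-rate coding, introduced to use limited bandwidth efficiently, first requires accurate classification of the signal. In particular, as multimedia services have become widespread, speech/music signal classification has grown in importance. Among speech/music classifiers, those using the support vector machine (SVM) have drawn attention for their high classification accuracy. However, SVMs demand substantial computation and storage, so an efficient implementation is required, especially where resources are limited, as in embedded systems. In this paper, we first analyze the implementation of an SVM-based speech/music classifier on an embedded system from the standpoint of execution time and energy consumption, and then propose two methods for efficient implementation. The first excludes support vectors whose contribution to the classification result is low, and the second skips part of the input signal by exploiting the correlation between frames that is inherent in speech/music signals. These techniques can be applied regardless of other optimization techniques used during SVM training, and experiments demonstrate their performance in terms of classification accuracy, execution time, and energy consumption.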
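
The two post-processing ideas summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual criteria, so everything specific here is an assumption made for illustration: the RBF kernel, ranking support vectors by the magnitude of their signed weight, the rule "after `stable_run` identical decisions, reuse the last label for `skip` frames", and all function and parameter names.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# One support vector as (signed weight alpha_i * y_i, vector x_i).
SupportVector = Tuple[float, Vector]


def rbf_kernel(x: Vector, z: Vector, gamma: float) -> float:
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2) (kernel choice is assumed)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_value(svs: Sequence[SupportVector], bias: float,
                   x: Vector, gamma: float) -> float:
    """Standard SVM decision function: sum_i w_i * K(x_i, x) + b."""
    return sum(w * rbf_kernel(sv, x, gamma) for w, sv in svs) + bias


def prune_support_vectors(svs: Sequence[SupportVector],
                          keep_ratio: float) -> List[SupportVector]:
    """Keep the fraction of support vectors with the largest |weight|.

    Using |weight| as the contribution measure is an assumption,
    not necessarily the paper's criterion.
    """
    n_keep = max(1, math.ceil(len(svs) * keep_ratio))
    ranked = sorted(svs, key=lambda s: abs(s[0]), reverse=True)
    return ranked[:n_keep]


def classify_with_skipping(frames: Sequence[Vector],
                           classify: Callable[[Vector], int],
                           stable_run: int,
                           skip: int) -> Tuple[List[int], int]:
    """Label every frame, but run the classifier only on some of them.

    After `stable_run` consecutive identical decisions, the last label is
    reused for the next `skip` frames (hypothetical skipping rule).
    Returns the labels and the number of frames actually classified.
    """
    labels: List[int] = []
    evaluated = 0
    run = 0        # length of the current run of identical decisions
    to_skip = 0    # frames still to be skipped
    for frame in frames:
        if to_skip > 0 and labels:
            labels.append(labels[-1])  # reuse the previous decision
            to_skip -= 1
            continue
        label = classify(frame)
        evaluated += 1
        run = run + 1 if labels and label == labels[-1] else 1
        labels.append(label)
        if run >= stable_run:
            to_skip = skip
            run = 0
    return labels, evaluated
```

In this sketch, pruning shrinks both the stored model and the number of kernel evaluations per frame, while skipping reduces the number of frames that reach the classifier at all; the two can be combined by passing a `classify` callable built on the pruned support vectors.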
