A New Speech Quality Measure for Speech Database Verification System

(Korean title, translated: A New Speech Recognition Performance Measure for a Speech Recognition Database Verification System)

  • Received : 2016.01.20
  • Accepted : 2016.03.03
  • Published : 2016.03.31

Abstract

This paper presents a speech recognition database verification system based on speech measures and describes the speech measure extraction algorithm applied to this system. In our previous study, to produce an effective speech quality measure for the system, we proposed a combination of various speech measures that are highly correlated with WER (Word Error Rate). That combination of several types of speech quality measures predicts speech recognition performance more effectively than any single speech measure alone. In this paper, we increase the system independence by employing a GMM acoustic score instead of the HMM score, which is obtained from a secondary speech recognition system. The combination with the GMM score shows a slightly lower correlation with WER than the combination with the HMM score; however, it yields a higher relative improvement in correlation with WER, where the improvement is calculated relative to the correlation of each speech measure alone.
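The evaluation described in the abstract (correlation of each measure with WER, correlation of a combined measure, and the relative improvement of the combination over each single measure) can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not state how the measures are combined, so the sketch assumes a simple average of z-scored measures, each sign-aligned with WER. The measure names `snr_db` and `acoustic_score` and all numeric values are made up for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def zscore(x):
    """Normalize a sequence to zero mean and unit (population) variance."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((a - m) ** 2 for a in x) / n)
    return [(a - m) / s for a in x]

def combine(measures, wer):
    """ASSUMED combination rule: average the z-scored measures, flipping
    the sign of any measure that correlates negatively with WER."""
    combined = [0.0] * len(wer)
    for values in measures.values():
        sign = 1.0 if pearson(values, wer) >= 0 else -1.0
        for i, z in enumerate(zscore(values)):
            combined[i] += sign * z / len(measures)
    return combined

def relative_improvement(r_combined, r_single):
    """Relative gain in |correlation| of the combination over one measure."""
    return (abs(r_combined) - abs(r_single)) / abs(r_single)

# Made-up illustrative data: one value per test condition (hypothetical).
wer = [5.0, 12.0, 20.0, 31.0, 45.0]
measures = {
    "snr_db": [25.0, 18.0, 15.0, 8.0, 3.0],
    "acoustic_score": [-60.0, -66.0, -64.0, -75.0, -80.0],
}

combined = combine(measures, wer)
r_combined = pearson(combined, wer)
for name, values in measures.items():
    r_single = pearson(values, wer)
    print(name, round(r_single, 3),
          round(relative_improvement(r_combined, r_single), 3))
print("combined", round(r_combined, 3))
```

Absolute values are used in `relative_improvement` because a measure such as a signal-to-noise ratio is expected to fall as WER rises, so only the strength of the correlation matters for the comparison. A fitted combination (for example, linear regression) would be a natural alternative to the plain average assumed here.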

Keywords

Word error rate; Correlation coefficient; Performance prediction; Speech recognition; Speech quality measure

Acknowledgement

Supported by: Ministry of Land, Infrastructure and Transport of the Korean government