- Volume 39 Issue 2
A study on user defined spoken wake-up word recognition system using deep neural network-hidden Markov model hybrid model
Deep neural network-hidden Markov model 하이브리드 구조의 모델을 사용한 사용자 정의 기동어 인식 시스템에 관한 연구
- Yoon, Ki-mu
- Kim, Wooil (Department of Computer Science & Engineering, Incheon National University)
- Received : 2020.01.23
- Accepted : 2020.03.04
- Published : 2020.03.31
A wake-up word (WUW) is a short utterance used to switch a speech recognizer into recognition mode. A WUW defined by the person who actually uses the speech recognizer is called a user-defined WUW. In this paper, to recognize user-defined WUWs, we construct systems based on a traditional Gaussian Mixture Model-Hidden Markov Model (GMM-HMM), a Linear Discriminant Analysis (LDA)-GMM-HMM, and an LDA-Deep Neural Network (DNN)-HMM, and compare their performance. To further improve the recognition accuracy of the WUW system, a threshold method is applied to each model, which significantly reduces both the WUW recognition error rate and the rejection failure rate for non-WUW utterances. For the LDA-DNN-HMM system, at a WUW error rate of 9.84 %, the rejection failure rate for non-WUW utterances is 0.0058 %, about 4.82 times lower than that of the LDA-GMM-HMM system. These results demonstrate that the LDA-DNN-HMM model developed in this paper is highly effective for constructing a user-defined WUW recognition system.
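The two rates quoted in the abstract pull in opposite directions as a decision threshold moves, which a small sketch can make concrete. The code below is only an illustration under stated assumptions: the abstract does not say how utterances are scored, so the function name `wuw_rates`, the idea of a single confidence score per utterance, and all score values are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def wuw_rates(
    wuw_scores: Sequence[float],
    non_wuw_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (WUW error rate %, non-WUW rejection failure rate %).

    Assumption: an utterance is accepted as the WUW when its score is
    greater than or equal to the threshold, and rejected otherwise.
    """
    # WUW utterances that fall below the threshold are wrongly rejected.
    missed = sum(1 for s in wuw_scores if s < threshold)
    # Non-WUW utterances at or above the threshold are wrongly accepted.
    false_accepts = sum(1 for s in non_wuw_scores if s >= threshold)
    wuw_error = 100.0 * missed / len(wuw_scores)
    rejection_failure = 100.0 * false_accepts / len(non_wuw_scores)
    return wuw_error, rejection_failure


# Hypothetical confidence scores, invented for this example only.
wuw_scores = [0.9, 0.8, 0.4, 0.95, 0.7]
non_wuw_scores = [0.1, 0.3, 0.75, 0.2, 0.05, 0.15, 0.5, 0.25]

for threshold in (0.3, 0.6, 0.85):
    wuw_error, rejection_failure = wuw_rates(wuw_scores, non_wuw_scores, threshold)
    print(f"threshold={threshold}: WUW error {wuw_error:.1f} %, "
          f"non-WUW rejection failure {rejection_failure:.1f} %")
```

Raising the threshold in this toy data lowers the non-WUW rejection failure rate while raising the WUW error rate, which is why the abstract reports the two figures as a pair at one operating point. Separately, if "about 4.82 times lower" is read as a simple ratio, it would put the LDA-GMM-HMM rejection failure rate near 0.0058 % × 4.82 ≈ 0.028 %; that figure is inferred here and is not stated in the abstract.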
Supported by: Incheon National University