Emergency dispatching based on automatic speech recognition (음성인식 기반 응급상황관제)

  • Received: 2016.05.30
  • Accepted: 2016.06.22
  • Published: 2016.06.30

Abstract

In emergency dispatching at the 119 Command & Dispatch Center, inconsistencies between the 'standard emergency aid system' and the 'dispatch protocol,' both of which dispatchers are required to follow, reduce the efficiency of the dispatcher's work. If an emergency dispatch system uses automatic speech recognition (ASR) to process the dispatcher's protocol speech during case registration, it can instantly extract and provide the information required by the 'standard emergency aid system,' making the rescue command more efficient. For this purpose, we developed a Korean large vocabulary continuous speech recognition (LVCSR) system with a 400,000-word vocabulary for use in the emergency dispatch system. The vocabulary covers the news, SNS, blog, and emergency rescue domains. The acoustic model is constructed using 1,300 hours of telephone-call speech (8 kHz), while the language model is constructed using a 13 GB text corpus. From a transcribed corpus of 6,600 real telephone calls, call logs with the emergency rescue command class and the identified major symptom are extracted in connection with the rescue activity log and the National Emergency Department Information System (NEDIS). ASR is applied to the emergency dispatcher's repetition utterances about patient information, and the emergency patient information is extracted based on the Levenshtein distance between the ASR result and the template information. Experimental results show a word error rate of 9.15% for speech recognition and an emergency response detection performance of 95.8% for the emergency dispatch system.
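The extraction step described above, matching the ASR result of the dispatcher's repetition utterance against template information by Levenshtein distance, can be sketched as follows. This is an illustration only: the template phrases, whitespace tokenization, and length normalization are assumptions for this sketch, not the authors' implementation, which presumably operates on the output of the 400,000-word Korean LVCSR system.

```python
# Minimal illustration (not the authors' code) of the extraction step described
# above: the ASR output for the dispatcher's repetition utterance is compared
# against template phrases with word-level Levenshtein distance, and the closest
# template is taken as the recognized patient-information item.
# The template phrases, whitespace tokenization, and length normalization are
# assumptions made for this sketch only.

def levenshtein(a, b):
    """Word-level Levenshtein (edit) distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def match_template(asr_hypothesis, templates):
    """Return the template label closest to the ASR output and its
    length-normalized edit distance."""
    hyp = asr_hypothesis.split()
    best_label, best_score = None, float("inf")
    for label, phrase in templates.items():
        ref = phrase.split()
        score = levenshtein(hyp, ref) / max(len(ref), 1)
        if score < best_score:
            best_label, best_score = label, score
    return best_label, best_score

if __name__ == "__main__":
    # Hypothetical symptom templates; the real system's templates would follow
    # the 'standard emergency aid system' categories.
    templates = {
        "chest pain": "환자 가슴 통증 호소",
        "unconscious": "환자 의식 없음",
        "bleeding": "환자 출혈 있음",
    }
    print(match_template("네 환자 가슴 통증 호소 맞으시죠", templates))
```

A lower normalized distance indicates a closer match to a template. The word error rate of 9.15% reported above is based on the same kind of edit-distance alignment: the number of substitutions, deletions, and insertions divided by the number of reference words.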

Keywords

References

  1. Alumae, T. (2014). Full-duplex Speech-to-text system for Estonian. Proceedings of Baltic HLT (pp. 3-10).
  2. Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), 707-710.
  3. Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlicek, P., Qian, Y., Schwarz, P., Silovsky, J., & Vesely, K. (2011). The Kaldi speech recognition toolkit. Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding.
  4. Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlicek, P., Qian, Y., Schwarz, P., Silovsky, J., & Vesely, K. (2011). Online decoding in Kaldi. Retrieved from http://kaldi-asr.org/doc/online_decoding.html on February 25, 2016.
  5. Schuster, M., & Nakajima, K. (2012). Japanese and Korean voice search. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 5149-5152).
  6. Stolcke, A., Zheng, J., Wang, W., & Abrash, V. (2011). SRILM at sixteen: Update and outlook. Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding.
  7. Zhang, X., Trmal, J., Povey, D., & Khudanpur, S. (2014). Improving deep neural network acoustic models using generalized maxout networks. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 215-219).
  8. Chung, M., Lee, K., Shin, D., Chung, J., & Kang, K. (2016). Applying Speech Recognition Technology to Emergency Dispatch. Proceedings of the Korean Institute of Fire Science and Engineering Conference (pp. 77-78). (정민화.이규환.신대진.정지오.강경희 (2016). 음성인식기술의 응급상황관제 적용, 한국화재소방학회 춘계학술대회 논문집, 77-78.)
  9. Jang, Y., Kang, K., Kim, J., & Kim, K. (2016). A study of Korean symptom expression in emergency call. Proceedings of the Korean Institute of Fire Science and Engineering Conference (pp. 75-76). (장윤희.강경희.김준태.김경혜 (2016). 구급 신고 전화에서의 한국어 증상 표현 연구, 한국화재소방학회 춘계학술대회 논문집, 75-76.)
  10. Kang, K. (2015). A comparison of information gathering and protocol in dispatching. Proceedings of the Korean Institute of Fire Science and Engineering Conference (pp. 119-120). (강경희 (2015). 구급 신고 접수와 수보요원의 도입부 수보 프로토콜의 비교, 한국화재소방학회 추계학술대회 논문집, 119-120.)
  11. Kim, K., Lee, D., Lim, M., & Kim, J. (2015). Input Dimension Reduction based on Continuous Word Vector for Deep Neural Network Language Model. Phonetics and Speech Sciences, 7(4), 3-8. (김광호.이동현.임민규.김지환 (2015). Deep Neural Network 언어모델을 위한 Continuous Word Vector 기반의 입력 차원 감소. 말소리와 음성과학, 7(4), 3-8.) https://doi.org/10.13064/KSSS.2015.7.4.003