Survey on Out-Of-Domain Detection for Dialog Systems

  • Jeong, Young-Seob (Department of Big Data Engineering, Soonchunhyang University)
  • Kim, Young-Min (Department of Big Data Engineering, Soonchunhyang University)
  • Received : 2019.07.02
  • Accepted : 2019.09.20
  • Published : 2019.09.27

Abstract

Dialog systems are becoming a new means of communication between humans and computers. A dialog system takes human speech as input and either gives an appropriate spoken response or performs an action. Although several well-known dialog system products exist (e.g., Amazon Echo, Naver Wave), they commonly suffer from the problem of out-of-domain utterances, that is, requests outside the domains the system supports. A system that detects out-of-domain utterances poorly will significantly harm user satisfaction. Some studies have aimed at solving this problem, but more intensive study is still needed. In this paper, we give an overview of previous studies on out-of-domain detection from three points of view: dataset, feature, and method. Because relatively few studies have addressed this topic owing to the lack of datasets, we believe the most important next research step is to construct and share a large dataset for dialog systems, and thereafter to apply state-of-the-art techniques to that dataset.
