A Study on Quantitative Evaluation Method for STT Engine Accuracy based on Korean Characteristics

한국어 특성 기반의 STT 엔진 정확도를 위한 정량적 평가방법 연구

  • Received : 2020.06.11
  • Accepted : 2020.07.03
  • Published : 2020.07.31

Abstract

With advances in deep learning, speech processing technology is being applied in a range of areas, including STT (Speech To Text), TTS (Text To Speech), chatbots, and intelligent personal assistants. STT in particular underpins voice-based services: because it converts human speech into text, it can be applied to many IT services. As a result, private enterprises, public institutions, and other adopters have recently been attempting to introduce the technology. However, unlike general IT solutions, whose quality can be evaluated quantitatively, the criteria and methods for evaluating the accuracy of an STT engine are ambiguous and do not consider the characteristics of the Korean language, which makes quantitative evaluation criteria difficult to apply. This study provides a guide to evaluating STT engine conversion performance based on the characteristics of the Korean language, so that engine manufacturers can perform STT conversion that reflects those characteristics and adopters can evaluate engines more accurately. On the experimental data, the proposed approach yielded a 35% more accurate evaluation than existing methods.
