A Method for Selecting Voice Game Commands to Maximize the Command Distance

명령어간 거리를 최대화하는 음성 게임 명령어의 선택 방법

  • Kim, Sangchul (Div. of Computer and Electronic Sys. Engineering, Hankuk Univ. of Foreign Studies)
  • 김상철 (한국외국어대학교 컴퓨터및전자시스템 공학부)
  • Received : 2019.06.05
  • Accepted : 2019.07.23
  • Published : 2019.08.20

Abstract

Interest in voice game commands has recently been increasing because of the diversity and convenience of this input method. The recognition rate of voice commands is affected not only by the performance of the recognition engine but also by the distance between commands. The command distance is the phonetic difference between command utterances; as this distance increases, the recognition rate improves. In this paper, we propose an IP (Integer Programming) model of the problem of selecting, from the candidate commands given for each command, the combination that maximizes the average distance between commands. We also propose an SA (Simulated Annealing)-based algorithm for solving the problem. We analyze the characteristics of our method through experiments under various conditions, such as the number of commands and the allowable command length.

최근 입력 방식의 다양성이나 편리성 때문에 음성 게임 명령어에 대한 관심이 증가하고 있다. 음성 명령어의 인식률은 인식 엔진의 성능뿐만이 아니라, 명령어간의 거리에도 영향을 받는다. 명령어간 거리란 명령어 발음간의 음성적 차이를 말하는데, 이 거리가 클수록 인식률이 높아진다. 본 논문에서 우리는 명령별 명령어 후보들이 주어졌을 때 명령어간의 평균 거리를 최대화하는 명령어 조합을 선택하는 문제를 IP(Integer Programming)으로 모델링한다. 또한 명령어 선택 문제의 해를 구하는 SA(Simulated Annealing) 기반의 방법을 제안한다. 우리의 방법을 명령어 수, 허용되는 명령어 길이 등의 다양한 조건에 하에서 실험한 결과를 토대로 특징을 분석한다.
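The selection problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes plain Levenshtein edit distance over spellings as a stand-in for the phonetic command distance, and the function names, the cooling schedule, the parameter values, and the candidate words are all hypothetical. Each action must be assigned exactly one of its candidate words, and the objective is the mean pairwise distance over the chosen words (at least two actions are required).

```python
import itertools
import math
import random


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; only a stand-in for a phonetic distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def average_distance(words):
    """Mean pairwise distance over the selected command words."""
    pairs = list(itertools.combinations(words, 2))
    return sum(edit_distance(a, b) for a, b in pairs) / len(pairs)


def select_commands(candidates, steps=2000, t_start=2.0, cooling=0.995, seed=0):
    """Choose one candidate per action by simulated annealing.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    actions = list(candidates)
    current = {a: rng.choice(candidates[a]) for a in actions}
    current_score = average_distance(list(current.values()))
    best, best_score = dict(current), current_score
    temperature = t_start
    for _ in range(steps):
        # Neighbor: re-pick the word of one randomly chosen action.
        action = rng.choice(actions)
        neighbor = dict(current)
        neighbor[action] = rng.choice(candidates[action])
        score = average_distance(list(neighbor.values()))
        delta = score - current_score
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = neighbor, score
            if current_score > best_score:
                best, best_score = dict(current), current_score
        temperature *= cooling  # geometric cooling
    return best, best_score


# Hypothetical candidate words for three game actions.
candidates = {
    "attack": ["attack", "fire", "strike"],
    "retreat": ["retreat", "back", "withdraw"],
    "stop": ["stop", "halt", "hold"],
}
best, best_score = select_commands(candidates)
print(best, round(best_score, 3))
```

With only 27 combinations, this toy instance could be solved by exhaustive search. A heuristic such as SA becomes relevant as the number of actions and candidates grows, because the number of combinations is the product of the candidate counts.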
