A Study on Speech Recognition Processing of Korean Words

  • Nam, Kihun (Dept. Computer Engineering, SeoKyeong Univ)
  • Received: 2019.09.13
  • Reviewed: 2019.10.15
  • Published: 2019.11.30

Abstract

This paper proposes a speech recognition processing technique for Korean at the word level. Speech recognition converts acoustic signals captured by sensors such as microphones into words or sentences. Most foreign languages pose relatively few difficulties for speech recognition. Korean, by contrast, is composed of vowels and final consonants (batchim), so the text obtained from a speech synthesis system is not suitable for use as-is. Recognizing words more accurately therefore requires improving the conventional speech recognition structure. To address this problem, a new algorithm was added to the existing speech recognition structure to raise the recognition rate. The input word is first preprocessed, and the result is tokenized. The results of a Levenshtein distance algorithm and a hashing algorithm are then combined and passed through a consonant comparison algorithm to output a standard word. The final result word is compared against a standardization table: if it is already in the table it is output, and if not it is registered in the table. For the experiments, a smartphone application was developed and used. Compared with the existing method, the proposed structure improved the recognition rate by about 2% for standard language and about 7% for dialect.
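The abstract does not give the details of the individual algorithms, so the following Python sketch is only one plausible reading of the pipeline it describes, not the paper's implementation. It assumes that preprocessing means Unicode NFC normalization, that tokenizing means decomposing each Hangul syllable into its jamo, that the hashing step can be stood in for by a hash-based set lookup, and that the consonant comparison checks initial consonants. All function names and the `max_distance` threshold are hypothetical.

```python
import unicodedata

# Jamo tables in Unicode syllable-composition order (19 initials, 21 vowels, 28 finals).
CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def preprocess(word):
    """Assumed preprocessing: trim whitespace and compose syllables (NFC)."""
    return unicodedata.normalize("NFC", word.strip())


def split_syllable(ch):
    """Return (initial, vowel, final) for a Hangul syllable, else None."""
    index = ord(ch) - 0xAC00
    if not 0 <= index < 11172:
        return None
    return (CHOSEONG[index // 588],
            JUNGSEONG[(index % 588) // 28],
            JONGSEONG[index % 28])


def to_jamo(word):
    """Assumed tokenization: flatten a word into its jamo sequence."""
    parts = []
    for ch in word:
        jamo = split_syllable(ch)
        parts.append("".join(jamo) if jamo else ch)
    return "".join(parts)


def initials(word):
    """Initial consonant of every Hangul syllable in the word."""
    return "".join(j[0] for j in map(split_syllable, word) if j)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalize(word, table, max_distance=1):
    """Map a recognized word to a standard word, registering unknown words."""
    word = preprocess(word)
    if word in table:  # hash-based exact match against the table
        return word
    best, best_dist = None, max_distance + 1
    for candidate in sorted(table):
        if initials(candidate) != initials(word):  # consonant comparison
            continue
        dist = levenshtein(to_jamo(word), to_jamo(candidate))
        if dist < best_dist:
            best, best_dist = candidate, dist
    if best is not None:
        return best
    table.add(word)  # not found: register the word in the table
    return word
```

As a usage example, with a hypothetical table `{"학교", "할머니"}`, calling `normalize("핵교", table)` compares the dialect form 핵교 with the standard form 학교 at the jamo level, where the two differ in a single vowel and share the same initial consonants. Comparing jamo rather than whole syllables is the design choice here: a one-vowel difference counts as one edit instead of a whole-syllable substitution.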
