ETRI small-sized dialog style TTS system

ETRI 소용량 대화체 음성합성시스템

  • Kim, Jong-Jin (Speech/Language information processing Center, ETRI) ;
  • Kim, Jeong-Se (Speech/Language information processing Center, ETRI) ;
  • Kim, Sang-Hun (Speech/Language information processing Center, ETRI) ;
  • Park, Jun (Speech/Language information processing Center, ETRI) ;
  • Lee, Yun-Keun (Speech/Language information processing Center, ETRI) ;
  • Hahn, Min-Soo (Information and Communication University)
  • 김종진 (한국전자통신연구원 음성언어정보연구센터) ;
  • 김정세 (한국전자통신연구원 음성언어정보연구센터) ;
  • 김상훈 (한국전자통신연구원 음성언어정보연구센터) ;
  • 박준 (한국전자통신연구원 음성언어정보연구센터) ;
  • 이윤근 (한국전자통신연구원 음성언어정보연구센터) ;
  • 한민수 (한국정보통신대학교)
  • Published : 2007.05.18

Abstract

This study outlines a small-sized, dialog-style ETRI Korean TTS system that applies an HMM-based speech synthesis technique. To build the VoiceFont, 500 dialog-style sentences were used to train the HMMs. Context information about phonemes, syllables, words, phrases, and sentences was extracted fully automatically to build context-dependent HMMs. The acoustic model was trained on acoustic features consisting of Mel-cepstra and log F0, together with delta and delta-delta coefficients. The VoiceFont produced by the training is 0.93Mb in size. The developed HMM-based TTS system was installed on an ARM720T processor running at a 60 MHz clock. To reduce computation time, the MLSA inverse filtering module was implemented in assembly language. The fully implemented system runs 1.73 times faster than real time.
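
As an illustration of the delta and delta-delta features mentioned in the abstract, the following is a minimal sketch of how dynamic features can be appended to a static feature track such as log F0 or a single Mel-cepstral coefficient. The window coefficients, the edge handling (repeating the first and last frame), and the function names `apply_window` and `add_dynamic_features` are assumptions chosen for illustration; the abstract does not state which windows the authors used.

```python
from typing import List, Sequence

# Assumed 3-tap regression windows, commonly seen in HMM-based synthesis
# toolkits. The abstract does not specify the windows actually used.
DELTA_WIN = (-0.5, 0.0, 0.5)
ACCEL_WIN = (1.0, -2.0, 1.0)


def apply_window(track: Sequence[float], win: Sequence[float]) -> List[float]:
    """Apply a 3-tap window to a 1-D feature track, repeating edge frames."""
    n = len(track)
    out = []
    for t in range(n):
        prev = track[max(t - 1, 0)]      # clamp at the first frame
        nxt = track[min(t + 1, n - 1)]   # clamp at the last frame
        out.append(win[0] * prev + win[1] * track[t] + win[2] * nxt)
    return out


def add_dynamic_features(track: Sequence[float]) -> List[List[float]]:
    """Return one [static, delta, delta-delta] triple per frame."""
    delta = apply_window(track, DELTA_WIN)
    accel = apply_window(track, ACCEL_WIN)
    return [list(frame) for frame in zip(track, delta, accel)]


if __name__ == "__main__":
    # Hypothetical 4-frame feature track, for demonstration only.
    for frame in add_dynamic_features([1.0, 2.0, 4.0, 7.0]):
        print(frame)
```

In a full system the same windows would be applied to every dimension of the static feature vector, so each frame's observation holds the static values followed by their delta and delta-delta values.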

Keywords