Algorithm for Concatenating Multiple Phonemic Units for Small Size Korean TTS Using RE-PSOLA Method

  • Bak, Il-Suh (SASPL, School of Mechatronics, Changwon National University) ;
  • Jo, Cheol-Woo (SASPL, School of Mechatronics, Changwon National University)
  • 발행 : 2003.03.01

초록

In this paper an algorithm to reduce the size of Text-to-Speech database is proposed. The algorithm is based on the characteristics of Korean phonemic units. From the initial database, a reduced phoneme unit set is induced by articulatory similarity of concatenating phonemes. Speech data is read by one female announcer for 1000 phonetically balanced sentences. All the recorded speech is then segmented by phoneticians. Total size of the original speech data is about 640 MB including laryngograph signal. To synthesize wave, RE-PSOLA (Residual-Excited Pitch Synchronous Overlap and Add Method) was used. The voice quality of synthesized speech was compared with original speech in terms of spectrographic informations and objective tests. The quality of the synthesized speech is not much degraded when the size of synthesis DB was reduced from 320 MB to 82 MB.

키워드