The Journal of the Korea Contents Association (한국콘텐츠학회논문지)
- Volume 11 Issue 5
- Pages 138-147
- 2011
- 1598-4877 (pISSN)
- 2508-6723 (eISSN)
Comparison of Adult and Child's Speech Recognition of Korean
한국어에서의 성인과 유아의 음성 인식 비교
- Received : 2011.01.14
- Accepted : 2011.03.31
- Published : 2011.05.28
Abstract
Most Korean speech databases have been developed from adults' speech rather than children's speech, whereas children's speech databases already exist for several other languages. Because children's and adults' speech differ widely in their acoustic and linguistic characteristics, a Korean children's speech database needs to be developed. To identify these differences in Korean, we built speech recognizers based on hidden Markov models (HMMs) and tested them according to gender, age, and the presence or absence of Vocal Tract Length Normalization (VTLN). The results show that a recognizer trained on children's speech achieves a much higher recognition rate than one trained on adults' speech, and that applying VTLN improves the recognition rate in Korean.
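The abstract does not state which warping function the VTLN step used. As a minimal sketch only, assuming the commonly used piecewise-linear frequency warp, the idea can be illustrated as follows. The function names `vtln_warp` and `pick_warp_factor`, the 8 kHz Nyquist frequency, the 0.875 knee ratio, and the direction convention for the warp factor are all illustrative assumptions, not details taken from the paper.

```python
def vtln_warp(f, alpha, f_nyquist=8000.0, knee_ratio=0.875):
    """Piecewise-linear VTLN warp of a frequency f (Hz).

    Assumed convention: below the knee the axis is scaled by alpha;
    above it a second linear segment maps the Nyquist frequency onto
    itself so the warped axis keeps the same bandwidth.
    """
    # Knee position (assumed): kept low enough that alpha * f0 < f_nyquist.
    f0 = knee_ratio * f_nyquist * min(1.0, 1.0 / alpha)
    if f <= f0:
        return alpha * f
    # Segment joining (f0, alpha * f0) to (f_nyquist, f_nyquist).
    slope = (f_nyquist - alpha * f0) / (f_nyquist - f0)
    return alpha * f0 + slope * (f - f0)


def pick_warp_factor(score, alphas):
    """Grid search: return the candidate alpha with the highest score.

    `score` is a hypothetical callback, for example the likelihood of a
    speaker's warped features under an acoustic model.
    """
    return max(alphas, key=score)
```

In this sketch a warp factor of 1.0 leaves the frequency axis unchanged, and per-speaker factors would typically be searched over a narrow range around 1.0. Whether a factor above 1.0 stretches or compresses a child's spectrum depends on the convention of the toolkit in use.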
Keywords
Speech Recognition; Comparison of Adult and Children's Speech; HMM; VTLN