Articulatory robotics

  • Nam, Hosung (Department of English Language and Literature, Korea University, Haskins Laboratories)
  • Received : 2021.05.25
  • Accepted : 2021.06.16
  • Published : 2021.06.30

Abstract

Speech is a spatiotemporally coordinated structure of constriction actions at discrete articulators such as the lips, tongue tip, tongue body, velum, and glottis. Like other human movements (e.g., reaching), each action, as a linguistic task, is achieved by a synergy of the basic elements involved (e.g., bones, muscles, the neural system). This paper discusses, from the perspective of robotics, how speech tasks are dynamically related to joints as one of these basic elements. It is hoped that introducing robotics to the speech sciences will deepen our understanding of how speech is produced and provide a solid foundation for developing a physical talking machine.
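In the task-dynamic view the abstract builds on, each constriction task (e.g., closing the lips) is commonly modeled as a critically damped mass-spring system that draws the task variable toward a constriction target without overshoot. The following is a minimal illustrative sketch of that idea, not the paper's implementation; the parameter values and the lip-aperture framing are assumptions for the example.

```python
import math

def simulate_gesture(x0, target, k=100.0, m=1.0, steps=2000, dt=0.001):
    """Critically damped point-attractor dynamics for one task variable:
    m*x'' + b*x' + k*(x - target) = 0, with b = 2*sqrt(k*m).
    Critical damping means the variable settles on the target
    without oscillating past it."""
    b = 2.0 * math.sqrt(k * m)  # critical damping coefficient
    x, v = x0, 0.0              # initial position and velocity
    for _ in range(steps):
        a = (-b * v - k * (x - target)) / m  # spring-damper acceleration
        v += a * dt                          # Euler integration step
        x += v * dt
    return x

# A lip-aperture-like gesture: from a 10 mm opening toward a 2 mm
# constriction target (units and values are illustrative).
final = simulate_gesture(10.0, 2.0)
print(round(final, 3))  # → 2.0
```

In the full task-dynamic model, such task-level trajectories are then mapped onto redundant sets of articulators (joints), so that many joint configurations can realize the same constriction task.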

Acknowledgement

This work was supported by a 2020 Special Research Grant from the College of Liberal Arts, Korea University (grant number: K2006521).
