Voice Driven Sound Sketch for Animation Authoring Tools

Korean title: 애니메이션 저작도구를 위한 음성 기반 음향 스케치 (Voice-Based Sound Sketch for Animation Authoring Tools)

  • Soonil Kwon (권순일), Digital Contents, Department of Computer Engineering, Sejong University
  • Received : 2009.12.10
  • Accepted : 2010.02.11
  • Published : 2010.04.28


Authoring tools for sketching the motion of animated characters have been studied, but natural interfaces for sound editing have not received comparable attention. In this paper, I present a novel method in which a sound sample is selected by speaking a sound-imitation word (onomatopoeia). An experiment with a method based on statistical models, as commonly used in pattern recognition, showed recognition accuracy of up to 97%. In addition, to address the difficulty of collecting training data for newly enrolled sound samples, a GLR (Generalized Likelihood Ratio) test based on only one sample per sound-imitation word achieved almost the same accuracy as the previous method.
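The one-sample GLR test described in the abstract can be illustrated with a small sketch. The feature representation and model details in the paper are not reproduced here; everything below (the diagonal-Gaussian assumption, the shared background variance, and all names) is an illustrative assumption, not the paper's implementation. The idea shown: score a spoken query against each enrolled sound-imitation word by comparing its log-likelihood under a word model (mean set to the single enrolled example) against a background model, and reject when no word clears a threshold.

```python
import math

def log_gauss(x, mu, var):
    # Log-density of a diagonal-covariance Gaussian at feature vector x.
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mu, var))

def glr_score(x, enrolled, bg_mu, bg_var):
    # GLR statistic (illustrative form): log-likelihood under the word model
    # (mean = the single enrolled sample, variance borrowed from the
    # background model, since one sample cannot estimate a variance)
    # minus log-likelihood under the background model.
    return log_gauss(x, enrolled, bg_var) - log_gauss(x, bg_mu, bg_var)

def classify(x, enrolled_words, bg_mu, bg_var, threshold=0.0):
    # Pick the enrolled sound-imitation word with the highest GLR score;
    # return None (reject) if no score exceeds the threshold.
    best, best_score = None, threshold
    for word, sample in enrolled_words.items():
        s = glr_score(x, sample, bg_mu, bg_var)
        if s > best_score:
            best, best_score = word, s
    return best
```

For example, with two enrolled words whose single samples are 2-D feature vectors, a query near the "boom" sample scores positively against the background and is matched, while a query sitting at the background mean is rejected. In a real system the features would be frame-level acoustic features (e.g. cepstral coefficients) rather than a single vector, and the threshold would be tuned on held-out data.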


Supported by: Sejong University


  1. M. Thorne, D. Burke, and M. van de Panne, "Motion Doodles: An Interface for Sketching Character Motion," ACM Transactions on Graphics, Vol.23, pp.424-431, 2004.
  2. T. Nakano, M. Goto, J. Ogata, and Y. Hiraga, "Voice Drummer: A Music Notation Interface of Drum Sounds Using Voice Percussion Input," Proc. of ACM Symposium on User Interface Software and Technology (UIST), pp.49-50, 2005.
  3. Z. Wang and M. van de Panne, "Walk to here: A Voice Driven Animation System," Proc. of Eurographics/ACM SIGGRAPH Symposium on Computer Animation, pp.16-20, 2006.
  4. O. Gillet and G. Richard, "Indexing and Querying Drum Loops Databases," Proc. of International Workshop on Content-Based Multimedia Indexing (CBMI'05), Riga, Latvia, 2005.
  5. K. Ishihara, Y. Tsubota, and H. G. Okuno, "Automatic Transformation of Environmental Sounds into Sound-Imitation Words Based on Japanese Syllable Structure," Proc. of European Conference on Speech Communication and Technology, pp.3185-3188, 2003.
  6. K. Ishihara, T. Nakatani, T. Ogata, and H. G. Okuno, "Automatic Sound-Imitation Word Recognition from Environmental Sounds Focusing on Ambiguity Problem in Determining Phonemes," Lecture Notes in Artificial Intelligence, Vol.3157, pp.909-918, 2004.
  7. T. C. Andringa and M. E. Niessen, "Real-world sound recognition: A recipe," Proc. of the 1st Workshop on Learning Semantics in Audio Signals(LSAS 2006), pp.106-118, 2006.
  8. R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, 2000.
  9. J. S. Baek, "A Generalized Likelihood Ratio Test in Outlier Detection," Korean Journal of Applied Statistics, Vol.4, pp.225-237, 1994.
  10. R. C. Davis, B. Colwell, and J. A. Landay, "K-sketch: A 'Kinetic' Sketch Pad for Novice Animators," Proc. of the SIGCHI Conference on Human Factors in Computing Systems (CHI '08), pp.413-422, 2008.
  11. M. Battermann, S. Heise, and J. Loviscach, "SonoSketch: Querying Sound Effect Databases through Painting," Proc. of 126th AES Convention, Paper Number 7794, 2009.

Cited by

  1. "Modified Mel Frequency Cepstral Coefficient for Korean Children's Speech Recognition," Vol.13, No.3, 2013.