Detection and Synthesis of Transition Parts of the Speech Signal

  • Published : 2008.03.31

Abstract

For efficient coding and transmission, the speech signal can be classified into three distinct classes: voiced, unvoiced, and transition. In low-bit-rate coding below 4 kbit/s, conventional sinusoidal transform coders synthesize high-quality speech for the purely voiced and unvoiced classes, but not for the transition class. The transition class, which includes plosive sounds and abrupt voiced onsets, lacks periodicity and is therefore often classified and synthesized as unvoiced. This paper proposes an efficient algorithm for detecting the transition class, which demonstrates superior detection performance not only for clean speech but also for noisy speech. For each detected transition frame, phase information is transmitted instead of magnitude information for speech synthesis. A listening test showed that the proposed algorithm produces better speech quality than the conventional one.
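
The abstract does not state which features the proposed detector uses, so the following is only an illustrative sketch of three-way frame classification and not the paper's algorithm. It assumes two generic features: the peak of the normalized autocorrelation as a periodicity measure, and the energy rise relative to the previous frame as an onset measure. All function names, thresholds, the 8 kHz sampling rate, and the 240-sample frame length are hypothetical choices made for this example.

```python
import math
import random


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def periodicity(frame, min_lag, max_lag):
    """Largest normalized autocorrelation over the candidate pitch lags."""
    n = len(frame)
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        a, b = frame[lag:], frame[:n - lag]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0.0:  # skip lags where one segment is all zeros
            best = max(best, num / den)
    return best


def classify_frame(frame, prev_frame, min_lag=20, max_lag=147,
                   periodic_thresh=0.7, onset_ratio=4.0, floor=1e-8):
    """Label a frame as 'voiced', 'unvoiced', or 'transition'.

    Both rules and all thresholds are assumptions for illustration.
    """
    # Assumed rule 1: a strongly periodic frame is voiced.
    if periodicity(frame, min_lag, max_lag) >= periodic_thresh:
        return "voiced"
    # Assumed rule 2: an aperiodic frame whose energy jumps sharply
    # relative to the previous frame is treated as a transition.
    prev_e = max(frame_energy(prev_frame), floor)
    if frame_energy(frame) > onset_ratio * prev_e:
        return "transition"
    # Everything else is aperiodic with steady energy: unvoiced.
    return "unvoiced"


def demo_frames(seed=0, n=240, fs=8000):
    """Synthetic (previous frame, current frame) pairs for each class."""
    rng = random.Random(seed)

    def noise(amp, k):
        return [amp * rng.gauss(0.0, 1.0) for _ in range(k)]

    # A 100 Hz tone has a period of 80 samples at 8 kHz.
    tone = [math.sin(2.0 * math.pi * 100.0 * i / fs) for i in range(n)]
    return {
        "voiced": (tone, tone),
        "unvoiced": (noise(0.1, n), noise(0.1, n)),
        # Near-silence followed by a noise burst in the second half-frame.
        "plosive": (noise(0.001, n),
                    noise(0.001, n // 2) + noise(1.0, n - n // 2)),
    }
```

Calling `classify_frame(frame, prev_frame)` on each pair from `demo_frames()` shows the point made in the abstract: without the energy-rise rule, the aperiodic plosive frame would fall through to the unvoiced label.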
