• Title/Summary/Keyword: Formant Trajectory


Pitch and Formant Trajectories of English Vowels by American Males with Different Speaking Styles (발화방식에 따른 미국인 남성 영어모음의 피치와 포먼트 궤적)

  • Yang, Byung-Gon
    • Phonetics and Speech Sciences, v.4 no.1, pp.21-28, 2012
  • Many previous studies have reported acoustic parameters of English vowels produced in a clear speaking style. In everyday usage, however, we produce speech sounds in various speaking styles, and different styles may yield different acoustic measurements. This study examines pitch and formant trajectories of eleven English vowels produced by nine American males in order to understand acoustic variation between clear and conversational speaking styles. The author used Praat to obtain trajectories systematically at seven equidistant time points over the vowel segment while checking measurement validity. Results showed that pitch trajectories followed distinct patterns depending on the four speaking styles. Generally, higher pitch values were observed in the higher vowels, and pitch was higher in the clear speaking styles than in the conversational styles. The same trend was observed in the three formant trajectories of front vowels and the first formant trajectories of back vowels. The second and third formant trajectories of back vowels revealed an opposite or inconsistent trend, which might be attributable to coarticulation with the following consonant or to lip rounding gestures. The author tentatively concludes that speakers tend to enhance pitch and formant differences between vowels in order to transmit information clearly. Further perceptual studies on synthesized vowels with varying pitch and formant values are desirable to test this conclusion.
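
The sampling scheme mentioned in this abstract (seven equidistant time points over the vowel segment) can be illustrated with a minimal sketch. It is not the author's Praat script. The function names `equidistant_times` and `sample_track` are hypothetical, and two choices are assumptions: that the points include both segment boundaries, and that values between analysis frames are obtained by linear interpolation.

```python
from bisect import bisect_right


def equidistant_times(start, end, n_points=7):
    """Return n_points equally spaced times from start to end.

    Assumption: both segment boundaries are included.
    """
    if n_points < 2:
        raise ValueError("need at least two points")
    step = (end - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]


def sample_track(times, values, query_times):
    """Linearly interpolate a (times, values) track at each query time.

    times must be sorted in ascending order. Queries outside the track
    are clamped to the first or last measured value.
    """
    samples = []
    for t in query_times:
        if t <= times[0]:
            samples.append(values[0])
        elif t >= times[-1]:
            samples.append(values[-1])
        else:
            j = bisect_right(times, t)  # times[j - 1] <= t < times[j]
            t0, t1 = times[j - 1], times[j]
            v0, v1 = values[j - 1], values[j]
            samples.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return samples
```

For a vowel spanning 0.10 s to 0.34 s, `equidistant_times(0.10, 0.34)` yields seven times spaced 0.04 s apart, which can then be passed to `sample_track` together with a measured pitch or formant track.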

Performance Evaluation of Cochlear Implants Speech Processing Strategy Using Neural Spike Train Decoding (Neural Spike Train Decoding에 기반한 인공와우 어음처리방식 성능평가)

  • Kim, Doo-Hee;Kim, Jin-Ho;Kim, Kyung-Hwan
    • Journal of Biomedical Engineering Research, v.28 no.2, pp.271-279, 2007
  • We suggest a novel method for evaluating cochlear implant (CI) speech processing strategies based on neural spike train decoding. From the formant trajectories of input speech and the auditory nerve responses to the electrical pulse trains generated by a specific CI speech processing strategy, an optimal linear decoding filter was obtained and used to estimate the formant trajectory of incoming speech. The performance of a specific strategy is evaluated by comparing the true and estimated formant trajectories. We compared a newly developed strategy, which mimics the auditory periphery more closely by using a nonlinear time-varying filter, with a conventional linear-filter-based strategy. The formant trajectories could be estimated more accurately with the nonlinear time-varying strategy. Its superiority was more prominent when the background noise level was high and the spectral characteristics of the background noise were close to those of speech signals. This confirms the superiority observed with other evaluation methods, such as acoustic simulation and spectral analysis.
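
The abstract does not say which measure was used to compare true and estimated formant trajectories. As an illustration only, the sketch below assumes root-mean-square error in Hz over frames; the function name `trajectory_rmse` is hypothetical and not taken from the paper.

```python
import math


def trajectory_rmse(true_hz, estimated_hz):
    """Root-mean-square error between two equal-length formant tracks.

    Assumption: RMSE is used here as one plausible comparison measure;
    the paper may have used a different one.
    """
    if not true_hz or len(true_hz) != len(estimated_hz):
        raise ValueError("tracks must be non-empty and of equal length")
    squared = sum((t - e) ** 2 for t, e in zip(true_hz, estimated_hz))
    return math.sqrt(squared / len(true_hz))
```

Under this measure, a strategy whose decoded trajectory gives a lower value than another strategy on the same speech and noise condition would be judged to preserve formant information better.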

Speech recognition rates and acoustic analyses of English vowels produced by Korean students

  • Yang, Byunggon
    • Phonetics and Speech Sciences, v.14 no.2, pp.11-17, 2022
  • English vowels play an important role in verbal communication. However, Korean students tend to have difficulty pronouncing a certain set of vowels despite extensive education in English. The aim of this study is to apply speech recognition software to evaluate Korean students' pronunciation of English vowels in minimal-pair words and then to examine the acoustic characteristics of the pairs in order to identify pronunciation problems. Thirty female Korean college students participated in the recording. Speech recognition rates were obtained to examine which English vowels were correctly pronounced. To compare and verify the recognition results, acoustic measurements such as the first and second formant trajectories and durations were also collected using Praat. The results showed an overall recognition rate of 54.7%. Some students incorrectly switched the tense and lax counterparts and produced the same vowel sounds for qualitatively different English vowels. In the acoustic analyses of the vowel formant trajectories, some of these vowel pairs nearly overlapped or exhibited only slight acoustic differences at the majority of the measurement points. On the other hand, statistical analyses of the first formant trajectories of the three vowel pairs revealed significant differences throughout the measurement points, a finding that requires further investigation. Durational comparisons revealed a consistent pattern among the vowel pairs. The author concludes that speech recognition and analysis software can be useful for diagnosing pronunciation problems of English-language learners.

Vowel Formant Trajectory Patterns for Shared Vowels of American English and Korean

  • Chung, Hyun-Ju;Kong, Eun-Jong;Weismer, Gary
    • Phonetics and Speech Sciences, v.2 no.4, pp.67-74, 2010
  • The purpose of this study was to explore the cross-linguistic difference in the spectral movement pattern of American English and Korean vowels. Eight American vowels /a/, /e/, /ɛ/, /i/, /ɪ/, /o/, /u/, and /ʊ/, and five Korean vowels, /a/, /e/, /i/, /o/ and /u/ in a fricative-vowel environment produced by adult speakers of each language were analyzed. The spectral movement patterns of the first two formant frequency values were measured and analyzed. The results showed that Korean vowels had minimal spectral movement, both in F1 and F2 values, as compared to American English vowels. Moreover, no consistent direction of movement was found in the three corner Korean vowels, while American English vowels showed consistent direction of movement for each vowel of the same phonemic category.


Speech Synthesis using Diphone Clustering and Improved Spectral Smoothing (다이폰 군집화와 개선된 스펙트럼 완만화에 의한 음성합성)

  • Jang, Hyo-Jong;Kim, Kwan-Jung;Kim, Gye-Young;Choi, Hyung-Il
    • The KIPS Transactions:PartB, v.10B no.6, pp.665-672, 2003
  • This paper describes a speech synthesis technique based on concatenating unit phonemes. A major problem with this approach is that discontinuities occur at the junctions between unit phonemes, especially between unit phonemes recorded by different speakers. To solve this problem, the paper uses clustered diphones and proposes a spectral smoothing technique that uses not only formant trajectories and the distribution characteristics of the spectrum but also reflects human acoustic characteristics. That is, the proposed technique clusters unit phonemes using the distribution characteristics of the spectrum at the junction between unit phonemes, decides the amount and extent of smoothing by considering human acoustic characteristics at the junction, and then performs spectral smoothing using weights calculated along the time axis at the border of two diphones. The proposed technique removes the discontinuity and minimizes the distortion that spectral smoothing can introduce. For performance evaluation, we tested five hundred diphones extracted from twenty sentences recorded by five speakers and present the experimental results.

Improvement of Synthetic Speech Quality using a New Spectral Smoothing Technique (새로운 스펙트럼 완만화에 의한 합성 음질 개선)

  • Jang, Hyo-Jong;Choi, Hyung-Il
    • Journal of KIISE:Software and Applications, v.30 no.11, pp.1037-1043, 2003
  • This paper describes a speech synthesis technique using a diphone as a unit phoneme. Speech synthesis is basically accomplished by concatenating unit phonemes, and its major problem is discontinuity at the junction between unit phonemes. To solve this problem, this paper proposes a new spectral smoothing technique that reflects not only formant trajectories but also the distribution characteristics of the spectrum and human acoustic characteristics. That is, the proposed technique decides the amount and extent of smoothing by considering human acoustic characteristics at the junction of unit phonemes, and then performs spectral smoothing using weights calculated along the time axis at the border of two diphones. The proposed technique reduces the discontinuity and minimizes the distortion caused by spectral smoothing. For performance evaluation, we tested five hundred diphones extracted from twenty sentences, using ETRI Voice DB samples and individually self-recorded samples.
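
The general idea of smoothing with weights calculated along the time axis at a diphone border can be sketched as follows. This is a generic linear-weight illustration, not the paper's algorithm: the abstract says the amount and extent of smoothing are derived from human acoustic characteristics, which this sketch replaces with a fixed, hypothetical `span` parameter. The function name `smooth_boundary` and the frame layout (one list of spectral values per frame) are also assumptions.

```python
def smooth_boundary(left, right, span=3):
    """Reduce the spectral jump where two diphone units are joined.

    left, right: lists of frames, each frame a list of spectral values.
    span: number of frames on each side to adjust (hypothetical fixed
    value; the paper derives the extent from perceptual considerations).
    Returns adjusted copies of both units.
    """
    if not left or not right or span < 1:
        raise ValueError("units must be non-empty and span must be >= 1")
    span = min(span, len(left), len(right))

    # Half of the jump at the border, per spectral bin.
    half_gap = [(r - l) / 2.0 for l, r in zip(left[-1], right[0])]

    new_left = [frame[:] for frame in left]
    new_right = [frame[:] for frame in right]
    for k in range(span):
        weight = (span - k) / span  # 1 at the border, decaying linearly
        li = len(left) - 1 - k
        new_left[li] = [v + weight * g for v, g in zip(left[li], half_gap)]
        new_right[k] = [v - weight * g for v, g in zip(right[k], half_gap)]
    return new_left, new_right
```

With full weight at the border, the last frame of the left unit and the first frame of the right unit both move to their midpoint, so the jump at the join disappears while frames farther from the border change progressively less.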

Classification of Diphthongs using Acoustic Phonetic Parameters (음향음성학 파라메터를 이용한 이중모음의 분류)

  • Lee, Suk-Myung;Choi, Jeung-Yoon
    • The Journal of the Acoustical Society of Korea, v.32 no.2, pp.167-173, 2013
  • This work examines classification of diphthongs, as part of a distinctive feature-based speech recognition system. Acoustic measurements related to the vocal tract and the voice source are examined, and analysis of variance (ANOVA) results show that vowel duration, energy trajectory, and formant variation are significant. A balanced error rate of 17.8% is obtained for 2-way diphthong classification on the TIMIT database, and error rates of 32.9%, 29.9%, and 20.2% are obtained for /aw/, /ay/, and /oy/, for 4-way classification, respectively. Adding the acoustic features to widely used Mel-frequency cepstral coefficients also improves classification.
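
The abstract reports a balanced error rate without defining it. A minimal sketch follows, assuming the common definition: the unweighted mean of the per-class error rates, so that a frequent class does not dominate the score. The function name `balanced_error_rate` and the example labels are hypothetical.

```python
def balanced_error_rate(true_labels, predicted_labels):
    """Return (balanced error rate, per-class error rates).

    Assumption: balanced error rate is the unweighted mean of the
    per-class error rates; the paper's exact definition is not given
    in the abstract.
    """
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and equal length")
    totals, errors = {}, {}
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        if truth != guess:
            errors[truth] = errors.get(truth, 0) + 1
    per_class = {c: errors.get(c, 0) / totals[c] for c in totals}
    return sum(per_class.values()) / len(per_class), per_class
```

For example, if one of four diphthong tokens and one of two non-diphthong tokens are misclassified, the per-class error rates are 0.25 and 0.5 and the balanced error rate is 0.375, whereas the plain error rate would be 2/6.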