• Title/Summary/Keyword: speech duration

A Study on the Durational Characteristics of Korean Distant-Talking Speech (한국어 원거리 음성의 지속시간 연구)

  • Kim, Sun-Hee
    • MALSORI / no.54 / pp.1-14 / 2005
  • This paper presents durational characteristics of Korean distant-talking speech using speech data, which consist of 500 distant-talking utterances and 500 normal utterances of 10 speakers (5 males and 5 females). Each file was segmented and labeled manually and the duration of each segment and each word was extracted. Using a statistical method, the durational change of distant-talking speech in comparison with normal speech was analyzed. The results show that the duration of words with distant-talking speech is increased in comparison with normal style, and that the average unvoiced consonantal duration is reduced while the average vocalic duration is increased. Female speakers show a stronger tendency towards lengthening the duration in distant-talking speech. Finally, this study also shows that the speakers of distant-talking speech could be classified according to their different duration rate.

Segment and Word Duration Produced by Preschool Children (학령전기 아동의 분절음 및 단어 길이)

  • Kang, Eunyeong
    • Journal of The Korean Society of Integrative Medicine / v.8 no.4 / pp.291-305 / 2020
  • Purpose: The duration of speech segments reflects children's speech motor development. The purpose of this study was to determine whether segmental sound and word duration varies by age among preschool children. Methods: A total of 60 children aged 4 to 5 years participated in this study. Participants took a picture-naming test to produce single-word speech data. The duration of the consonant in word-initial and word-final position, the voice onset time of plosives, the duration of the vowel following the initial consonant, and the duration of the word were measured. Results: As age increased, the duration of the initial consonant, the duration of the word, and the voice onset time decreased significantly. The main effects of age, manner of articulation, and place of articulation on the duration of the initial consonant were significant. The durations of nasals and plosives and the durations of bilabial and alveolar sounds differed significantly between groups. The main effects of age and vocal type on voice onset time were significant. The main effects of age on the duration of the word-final consonant and on the duration of the vowel were not statistically significant. Conclusion: The results of this study showed that the durations of segmental sounds and words were associated with speech development between 4 and 5 years of age. Accordingly, the durations of segmental sounds and words may serve as acoustic cues, as they reflect speech development and the maturity of speech motor control.

Perceptual Evaluation of Duration Models in Spoken Korean

  • Chung, Hyun-Song
    • Speech Sciences / v.9 no.1 / pp.207-215 / 2002
  • Perceptual evaluation of duration models of spoken Korean was carried out based on the Classification and Regression Tree (CART) model for text-to-speech conversion. A reference set of durations was produced by a commercial text-to-speech synthesis system for comparison. The duration model which was built in the previous research (Chung & Huckvale, 2001) was applied to a Korean language speech synthesis diphone database, 'Hanmal (HN 1.0)'. The synthetic speech produced by the CART duration model was preferred in the subjective preference test by a small margin and the synthetic speech from the commercial system was superior in the clarity test. In the course of preparing the experiment, a labeled database of spoken Korean with 670 sentences was constructed. As a result of the experiment, a trained duration model for speech synthesis was obtained. The 'Hanmal' diphone database for Korean speech synthesis was also developed as a by-product of the perceptual evaluation.

Aerodynamic Characteristics of Whispered and Normal Speech during Reading Paragraph Tasks (문단낭독 시 속삭임 발화와 정상 발화의 공기역학적 특성)

  • Pyo, Hwayoung
    • Phonetics and Speech Sciences / v.6 no.3 / pp.57-62 / 2014
  • The present study investigated the aerodynamic characteristics of whispered and normal speech during paragraph reading tasks. Thirty-nine normal females (18-23 yrs.) read the 'Autumn' paragraph with whispered and normal phonation. Their readings were recorded and analyzed using 'Running Speech' in the Phonatory Aerodynamic System (PAS) instrument. The results showed that, in whispered speech, the total duration was longer and inspirations were more frequent than in normal speech. The peak expiratory and inspiratory rates were higher in normal speech, but the expiratory and inspiratory volumes were higher in whispered speech. In the correlation analysis, both whispered and normal speech showed significantly high correlations between total duration and expiratory/inspiratory airflow duration, between the number of inspirations and inspiratory airflow duration, and between expiratory and inspiratory volume. These results show that whispered speech needs more respiratory effort but shows poorer aerodynamic efficacy during phonation than normal speech.

Control of Duration Model Parameters in HMM-based Korean Speech Synthesis (HMM 기반의 한국어 음성합성에서 지속시간 모델 파라미터 제어)

  • Kim, Il-Hwan;Bae, Keun-Sung
    • Speech Sciences / v.15 no.4 / pp.97-105 / 2008
  • HMM-based text-to-speech systems (HTS) have been widely studied because, compared with corpus-based unit-concatenation text-to-speech systems, they need less memory, have lower computational complexity, and are suitable for embedded systems. They also have the advantage that the voice characteristics and speaking rate of the synthetic speech can be converted easily by modifying HMM parameters appropriately. We implemented an HMM-based Korean text-to-speech system using a small Korean speech DB and propose a method to increase the naturalness of the synthetic speech by controlling duration model parameters in the system. We performed a paired comparison test to verify that these techniques are effective. The test result, a preference score of 73.8%, shows that controlling the duration model parameters improves the naturalness of the synthetic speech.

An Acoustic Analysis of Speech in Patients with Nonfluent Aphasia (비 유창성 실어증 환자 말소리의 음향학적 분석)

  • Kim, Hyun-Gi;Kang, Eun-Young;Kim, Yun-Hee
    • Speech Sciences / v.9 no.3 / pp.87-97 / 2002
  • The purpose of this study is to analyze speech duration in Korean-speaking aphasics. Five patients with nonfluent aphasia (2 with traumatic brain injury and 3 with strokes) and five normal adults participated in this experiment. The mean age was $45.8\pm2.3$ years for the patients with nonfluent aphasia and $47.4\pm2.3$ years for the normal adults. The Computerized Speech Lab was used to evaluate the acoustic characteristics of the subjects. Voice onset time, vowel duration, total duration, hold, and consonant duration were evaluated for the monosyllabic and the polysyllabic words. The patients with nonfluent aphasia did not show a voicing bar in the hold area, whereas it was seen in the normal persons in the intervocalic position. Explosion duration of glottalized stops in the intervocalic position was significantly prolonged in nonfluent aphasics in comparison with the normal persons. This suggests that the laryngeal adjustment is disturbed in these patients. Consonant duration, vowel duration, and total duration of the polysyllabic words were significantly longer in the patients with nonfluent aphasia than in the normal persons. These results demonstrate disturbances in controlling the articulatory muscles during sound production in patients with nonfluent aphasia. Objective and quantitative analysis based on the acoustic characteristics of nonfluent aphasics will be very useful in therapeutic planning and in assessing the effects of speech therapy.

A Study on the Durational Characteristics of Korean Lombard Speech (한국어 롬바드 음성의 지속시간 연구)

  • Kim, Sun-Hee
    • Proceedings of the KSPS conference / 2005.04a / pp.21-24 / 2005
  • This paper presents durational characteristics of Korean Lombard speech using data, which consist of 500 Lombard utterances and 500 normal utterances of 10 speakers (5 males and 5 females). Each file was segmented and labeled manually and the duration of each segment and each word was extracted. The durational change of Lombard effect in comparison with normal speech was analyzed using a statistical method. The results show that the duration of words with Lombard effect is increased in comparison with normal style, and that the average unvoiced consonantal duration is reduced while the average vocalic duration is increased. Female speakers show a stronger tendency towards lengthening the duration in Lombard speech, but without statistical significance. Finally, this study also shows that the speakers of Lombard speech could be classified according to their different duration rate.

Durational Interaction of Stops and Vowels in English and Korean Child-Directed Speech

  • Choi, Han-Sook
    • Phonetics and Speech Sciences / v.4 no.2 / pp.61-70 / 2012
  • The current study observes the durational interaction of tautosyllabic consonants and vowels in the word-initial position of English and Korean child-directed speech (CDS). The effect of phonological laryngeal contrasts in stops on the following vowel duration, and the effect of the intrinsic vowel duration on the release duration of preceding stops, in addition to the acoustic realization of the contrastive segments, are explored in different prosodic contexts - phrase-initial/medial, focal accented/non-focused - in the marked speech style of CDS. A trade-off relationship between Voice Onset Time (VOT), as consonant release duration, and voicing phonation time, as vowel duration, reported for adult-to-adult speech, and patterns of durational variability are investigated in the CDS of two languages with different linguistic rhythms, under systematically controlled prosodic contexts. Speech data were collected from four native English-speaking mothers and four native Korean-speaking mothers who were talking to their infants at the one-word stage. In addition to the acoustic measurements, the transformed delta measure is employed as a variability index of individual tokens. Results confirm the durational correlation between prevocalic consonants and following vowels. The interaction is revealed in a compensatory pattern such as longer VOTs followed by shorter vowel durations in both languages. An asymmetry is found in CV interaction in that the effect of the consonant on vowel duration is greater than the VOT differences induced by the vowel. Prosodic effects are found such that the acoustic difference between the contrastive segments is enhanced under focal accent, supporting the paradigmatic strengthening effect. Positional variation, however, does not show any systematic effects on the variations of the measured acoustic quantities. Overall vowel duration and syllable duration are longer in English tokens but involve less variability across the prosodic variations. The constancy of syllable duration, therefore, is not found to be more strongly sustained in Korean CDS. The stylistic variation is discussed in relation to the listener under linguistic development in CDS.
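
The compensatory pattern reported in this abstract (longer VOTs followed by shorter vowel durations) can be illustrated with a simple correlation check. The following is a minimal Python sketch, not the study's analysis: the abstract does not name the statistic used, so Pearson's r is an assumption here, and the function name `pearson_r` and all duration values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-token measurements in milliseconds (not data from the study).
vot_ms = [95.0, 80.0, 60.0, 35.0, 20.0]
vowel_ms = [110.0, 125.0, 140.0, 160.0, 170.0]

r = pearson_r(vot_ms, vowel_ms)
print(f"r = {r:.2f}")  # a negative r is consistent with a VOT/vowel trade-off
```

A negative coefficient only indicates that the two durations move in opposite directions; it says nothing about the asymmetry between consonant and vowel effects that the abstract also reports.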

Performance Comparison and Duration Model Improvement of Speaker Adaptation Methods in HMM-based Korean Speech Synthesis (HMM 기반 한국어 음성합성에서의 화자적응 방식 성능비교 및 지속시간 모델 개선)

  • Lee, Hea-Min;Kim, Hyung-Soon
    • Phonetics and Speech Sciences / v.4 no.3 / pp.111-117 / 2012
  • In this paper, we compare the performance of several speaker adaptation methods for an HMM-based Korean speech synthesis system with small amounts of adaptation data. According to objective and subjective evaluations, a hybrid method of constrained structural maximum a posteriori linear regression (CSMAPLR) and maximum a posteriori (MAP) adaptation shows better performance than other methods when only five minutes of adaptation data are available for the target speaker. During the objective evaluation, we find that the duration models are less well adapted to the target speaker than the spectral envelope and pitch models. To alleviate the problem, we propose the duration rectification method and the duration interpolation method. Both the objective and subjective evaluations reveal that incorporating the two proposed methods into the conventional speaker adaptation method is effective in improving the performance of the duration model adaptation.

Durational aspects of Korean nasal geminates

  • Oh, Eunhae
    • Phonetics and Speech Sciences / v.9 no.4 / pp.19-25 / 2017
  • The current study focused on the production of geminate nasal consonants across different word boundary types in Korean as a function of speech style to investigate whether temporal properties are preserved across varying speaking rates. Assimilated geminates in Korean, known as true geminates, are produced with distinctively longer consonant duration compared to singletons. Despite a large body of literature for geminates across different languages, geminates in Korean have been relatively less investigated with respect to the durational patterns in relative terms and temporal variabilities. In this study, singletons, word-internal geminates and word-boundary (fake) geminates produced by ten native Seoul Korean speakers were compared in terms of absolute consonant closure duration, preceding vowel duration, the relative ratios (consonant-to-preceding vowel duration) as well as the temporal variabilities in speech production. The results showed that word-internal geminates were produced with longer consonant duration and greater temporal variabilities than singletons and word-boundary geminates in absolute duration, indicating relatively greater flexibility in timing. However, only word-internal geminates were produced with distinctively longer consonant duration with significantly lower variability in relative duration regardless of speech styles. The results provide some insight into the representation of temporal information in the production of Korean geminate consonants.
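
The two kinds of measure contrasted in this abstract, absolute closure duration and the relative consonant-to-preceding-vowel ratio, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's procedure: the abstract does not say how temporal variability was quantified, so the coefficient of variation is an assumption, and the function names and all duration values below are hypothetical.

```python
from statistics import mean, stdev


def relative_ratio(consonant_ms, vowel_ms):
    """Consonant closure duration relative to the preceding vowel duration."""
    return consonant_ms / vowel_ms


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed variability index)."""
    return stdev(values) / mean(values)


def summarize(tokens):
    """Summarize (closure_ms, preceding_vowel_ms) pairs for one consonant type."""
    closures = [c for c, _ in tokens]
    ratios = [relative_ratio(c, v) for c, v in tokens]
    return {
        "mean_closure_ms": mean(closures),
        "cv_closure": coefficient_of_variation(closures),
        "mean_ratio": mean(ratios),
        "cv_ratio": coefficient_of_variation(ratios),
    }


# Hypothetical tokens, chosen only to exercise the functions (not study data).
tokens_by_type = {
    "singleton": [(60.0, 90.0), (66.0, 95.0), (55.0, 88.0)],
    "word-internal geminate": [(150.0, 75.0), (120.0, 60.0), (180.0, 90.0)],
}

for label, tokens in tokens_by_type.items():
    print(label, summarize(tokens))
```

Keeping the absolute and relative summaries side by side makes it possible to see a case where closure durations vary widely across tokens while the ratio to the preceding vowel stays stable.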