• Title/Summary/Keyword: Prosody

Building a Sentential Model for Automatic Prosody Evaluation

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences / v.1 no.4 / pp.47-59 / 2009
  • The purpose of this paper is to propose an automatic technique for evaluating the prosody of an English sentence uttered by Korean learners of English. The underlying hypothesis is that the consistency of manual prosody scoring is reflected in a prosody evaluation space constructed from the three physical properties of prosody considered in this paper: the fundamental frequency (F0) contour, the intensity contour, and the segmental durations. The evaluation proceeds first by building a prosody evaluation model for the sentence. To create the model, utterances of the target sentence from native speakers of English and Korean learners are manually scored for prosody by either native teachers of English or Korean phoneticians. Multiple native utterances are selected on the basis of the manual scores as the "model" native utterances against which all the other Korean learners' utterances, as well as the model utterances themselves, can be semi-automatically evaluated by comparison in terms of the three prosodic aspects [7]. Each learner utterance, when compared to the multiple model native utterances, produces multiple coordinates in a three-dimensional space of prosody evaluation, each axis of which corresponds to one of the three prosodic aspects. The 3D coordinates from all the comparisons form a prosody evaluation model for the particular sentence, and the associated manual scores can reveal regions of particular scores. The model can then be used as a predictive model against which other Korean utterances of the target sentence can be evaluated. The model from a Korean phonetician appears to support the hypothesis.

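The comparison step in the abstract above, where one learner utterance is compared with several model native utterances to yield points in a three-dimensional space, can be illustrated with a minimal sketch. The abstract does not say how the per-aspect comparison is computed, so the root-mean-square difference used here, the dictionary layout, and the function names `rms_diff` and `prosody_coordinates` are all assumptions made for illustration. The sketch also assumes the contours have already been time-aligned and resampled to equal length.

```python
import math

def rms_diff(a, b):
    """Root-mean-square difference between two equal-length sequences.

    Assumed distance measure; the paper's actual measure is not stated.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def prosody_coordinates(learner, models):
    """Return one (F0, intensity, duration) distance triple per model utterance."""
    return [
        (
            rms_diff(learner["f0"], m["f0"]),
            rms_diff(learner["intensity"], m["intensity"]),
            rms_diff(learner["durations"], m["durations"]),
        )
        for m in models
    ]

# Hypothetical, already-aligned data: F0 in Hz, intensity in dB, durations in s.
learner = {"f0": [200.0, 210.0, 190.0],
           "intensity": [60.0, 65.0, 62.0],
           "durations": [0.10, 0.20, 0.15]}
models = [
    {"f0": [200.0, 210.0, 190.0],
     "intensity": [60.0, 65.0, 62.0],
     "durations": [0.10, 0.20, 0.15]},
    {"f0": [203.0, 214.0, 190.0],
     "intensity": [60.0, 65.0, 62.0],
     "durations": [0.10, 0.20, 0.15]},
]

# One 3D point per model utterance; identical utterances map to the origin.
coords = prosody_coordinates(learner, models)
```

Each triple in `coords` is one point in the evaluation space. Pairing such points with the manual scores is what would let regions of the space be associated with particular scores.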

Interaction between emotional content of word and prosody in the evaluation of emotional valence (정서의미 전달에 있어서 운율과 단어 정보의 상호작용.)

  • Choi, Moon-Gee;Nam, Ki-Chun
    • Proceedings of the KSPS conference / 2007.05a / pp.67-70 / 2007
  • The present paper focuses on the interaction between lexical-semantic information and affective prosody. Previous studies showed that the influence of lexical-semantic information on the affective evaluation of prosody was relatively clear, but the influence of emotional prosody on word evaluation remains ambiguous. In the present study, we explore whether affective prosody influences the evaluation of the affective meaning of a word and vice versa, using more ecologically valid stimuli (sentences) than single words. We asked participants to evaluate the emotional valence of sentences recorded with affective prosody (negative, neutral, and positive) in Experiment 1 and the emotional valence of their prosody in Experiment 2. The results showed that the emotional valence of prosody can influence the emotional evaluation of sentences and vice versa. Interestingly, positive prosody is likely to be more responsible for this interaction.

Improvement of Prosody Transplantation Technology for English Prosody Education and Its Application (운율교육을 위한 운율이식기술 개선 방안 연구)

  • Yi, So-Pae
    • MALSORI / no.61 / pp.49-62 / 2007
  • This study focused on improving prosody transplantation technology for use in effective prosody education. Issues that make the technology a less acceptable tool for prosody education were addressed. Instead of merely copying the target pitch onto a learner's utterances, the target pitch was rescaled in semitones before the transplantation. In so doing, distortion of the signal was minimized and the transplanted utterance retained a sound quality no different from that of the learner's utterances. Instead of manual transplantation, an automatic procedure was proposed to increase the reliability and consistency of the outcome and to enable real-time processing. The perceptual performance of the automatic transplantation was evaluated in a perception experiment, which showed that the automatic transplantation was as good as the manual process.

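The semitone rescaling mentioned in the abstract above can be illustrated with the standard conversion between frequency and semitones, st = 12 · log2(f / f_ref). The sketch below is one plausible reading, not the study's procedure: it expresses the target contour in semitones relative to a reference pitch of the target speaker, then re-anchors it at a reference pitch of the learner. The choice of reference values and the function names are assumptions.

```python
import math

def hz_to_semitones(f_hz, ref_hz):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)

def semitones_to_hz(st, ref_hz):
    """Convert semitones relative to ref_hz back to a frequency in Hz."""
    return ref_hz * 2.0 ** (st / 12.0)

def rescale_contour(target_hz, target_ref_hz, learner_ref_hz):
    """Re-anchor a target pitch contour at the learner's reference pitch.

    The melodic shape (semitone intervals) of the target is kept, while the
    overall pitch level moves to the learner's range. The reference pitches
    (for example each speaker's mean F0) are assumed inputs.
    """
    return [
        semitones_to_hz(hz_to_semitones(f, target_ref_hz), learner_ref_hz)
        for f in target_hz
    ]

# Hypothetical contour: target speaker around 100 Hz, learner around 200 Hz.
rescaled = rescale_contour([100.0, 200.0, 150.0], 100.0, 200.0)
```

Working in semitones keeps perceived intervals constant across speakers. With a single reference pitch per speaker, as here, the result equals multiplying the contour by the ratio of the two references.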

The Contribution of Prosody to the Foreign Accent of Chinese Talkers' English Speech

  • Liu, Xing;Lee, Joo-Kyeong
    • Phonetics and Speech Sciences / v.4 no.3 / pp.59-73 / 2012
  • This study investigates the contribution of prosody to the foreign accent in Chinese speakers' English production by examining synthesized speech that crosses native and non-native talkers' prosody and segments. For the stimuli of the foreign accent ratings, we transplanted gender-matched native speakers' prosody onto non-native talkers' segments and vice versa, utilizing the TD-PSOLA algorithm. Eight native English listeners judged the foreign accent and comprehensibility of the transplanted stimuli. Results showed that the synthesized stimuli were perceived as having a stronger foreign accent, regardless of speakers' proficiency, when English speakers' prosody was crossed with Chinese speakers' segments. This suggests that segments contribute more than prosody to native listeners' evaluation of foreign accent. When transplanted with English speakers' segments, Chinese speakers' prosody showed a difference in duration rather than pitch between high and low proficiency, such that a stronger foreign accent was detected when low-proficiency Chinese speakers' duration was crossed with English speakers' segments. This indicates that prosody, more specifically duration, plays a role, though overall the role of prosody is not as significant as that of segments. According to a follow-up acoustic analysis, the temporal features that made the duration parameter more prominent than pitch were speaking rate, pause duration, and pause frequency. Finally, foreign accent and comprehensibility showed no significant correlation, such that native listeners had no difficulty listening to highly foreign-accented speech.

A Study of an Independent Evaluation of Prosody and Segmentals: With Reference to the Difference in the Evaluation of English Pronunciation across Subject Groups (운율 및 분절음의 독립적 발음 평가 연구: 평가자 집단의 언어별 차이를 중심으로)

  • Park, Hansang
    • Phonetics and Speech Sciences / v.5 no.4 / pp.91-98 / 2013
  • This study investigates the difference in the evaluation of foreign-accentedness of English pronunciation across subject groups, evaluated accents, and compared components. It independently evaluates the prosody and segmentals of foreign-accented English sentences by pairwise difference rating. Using the prosody swapping technique, segmentals and prosody of the English sentences read by native speakers of American English (one male and one female) were combined with the corresponding segmentals and prosody of the English sentences read by male and female native speakers of Chinese, Japanese, or Korean (one male and one female from each native language). These stimuli were evaluated by four subject groups: native speakers of American English, Korean, Chinese, and Japanese. The results showed that the Japanese subject group scored higher in prosody difference than in segmental difference, while the other groups showed the opposite pattern. This study is significant in that it shows that the attitude toward the difference in segmentals and prosody of foreign accents of English varies with the native language of the subject group. In other words, for native speakers of some languages, the difference in prosody could have a greater influence on foreign-accentedness than the difference in segmentals, while for native speakers of other languages the reverse holds.

Chinese Prosody Generation Based on C-ToBI Representation for Text-to-Speech (음성합성을 위한 C-ToBI기반의 중국어 운율 경계와 F0 contour 생성)

  • Kim, Seung-Won;Zheng, Yu;Lee, Gary-Geunbae;Kim, Byeong-Chang
    • MALSORI / no.53 / pp.75-92 / 2005
  • Prosody modeling is critical in developing text-to-speech (TTS) systems, where speech synthesis is used to automatically generate natural speech. In this paper, we present a prosody generation architecture based on the Chinese Tone and Break Index (C-ToBI) representation. ToBI is a multi-tier representation system based on linguistic knowledge for transcribing events in an utterance. A TTS system that adopts ToBI as an intermediate representation is known to exhibit higher flexibility, modularity, and domain/task portability than TTS systems with direct prosody generation. However, corpus preparation is very expensive for practical-level performance because a ToBI-labeled corpus is manually constructed by many prosody experts, and a large amount of data is normally required for accurate statistical prosody modeling. This paper proposes a new method that transcribes C-ToBI labels automatically in Chinese speech. We model Chinese prosody generation as a classification problem and apply conditional Maximum Entropy (ME) classification to it. We empirically verify the usefulness of various natural language and phonology features in building well-integrated features for the ME framework.

Discourse-level Prosody Produced by Korean Learners of English

  • Kim, Boram
    • Phonetics and Speech Sciences / v.6 no.4 / pp.67-77 / 2014
  • This study investigated (1) whether Korean learners of English use discourse-level prosody in L2 production as native speakers of English do, and (2) whether discourse-level prosody is also found in the Korean language, as is evident in the prosody of native speakers of English. The study compared the production of the same 15 sentences in two types of reading materials, sentence-level and discourse-level. It analyzed the onset pitch, sentence mean pitch, and pause length to examine the realization of the paratone (intonational paragraph) in discourse-level speech. The results showed that in L2 discourse-level prosody, the Korean speakers were limited in displaying paratone and did not make a significant difference between sentence-level and discourse-level prosody. On the other hand, in L1 discourse-level text, both English and Korean participants demonstrated paratone using pitch. However, the two groups differed in their use of prosodic cues. In using pauses, the ES group paused longer before both the orthographically marked and the unmarked topic sentences. The KS group paused longer only before the orthographically marked topic sentence in both L1 and L2 text reading. In the comparison of sentence-level and discourse-level prosody, the topic sentences were marked by different prosodic cues: English participants used higher sentence mean pitch, and Korean participants used higher onset pitch.

A Study of an Independent Evaluation of Prosody and Segmentals: with Reference to the Difference in the Foreign Accent of Korean, Chinese, and Japanese Learners of English (운율 및 분절음의 독립적 발음 평가 연구: 한국인, 중국인, 일본인 영어 학습자의 액센트 차이를 중심으로)

  • Park, Hansang
    • Phonetics and Speech Sciences / v.4 no.4 / pp.37-43 / 2012
  • This study investigates an independent evaluation of prosody and segmentals with reference to the difference in the foreign accent of Korean, Chinese, and Japanese learners of English. For this study, a set of stimuli was created with the prosody swapping technique from English sentences read by male and female Korean, Chinese, and Japanese learners of English. Two subject groups, American and Korean, evaluated the difference in the prosody and segmentals of the stimuli by pairwise difference rating. The results showed that there was no significant difference in the evaluation scores of prosody and segmentals across accents for either subject group. The results also showed that both subject groups gave higher scores for segmentals than for prosody. The results of the present study are significant in that they contradict the claim of some previous studies that prosodic factors could have a greater influence on foreign accent and intelligibility than segmentals.

Perception of Korean Prosody by Native Speakers of English and Native Speakers of Korean (영어 원어민과 한국어 원어민의 한국어운율 인식)

  • Yi, So-Pae
    • MALSORI / no.65 / pp.1-11 / 2008
  • This study explored the perception of transplanted Korean prosody by NE (native speakers of English) and NK (native speakers of Korean) listeners. Korean utterances of various sentence types produced by NE and NK were used to transplant the original Korean prosody contours onto the Korean utterances read by NE. Then, other NE and NK listeners were instructed to rate the transplanted prosodic components. Results showed that the interactions of rater group with the three factors (transplantation types & rater groups, sentence types & rater groups, sentence length & rater groups) turned out to be meaningful. Both rater groups preferred the combined effect of transplanted prosodic components (e.g., DP, DPI) to that of individual transplantation (e.g., I, D, P). NE were more sensitive to duration change than to pitch change, whereas NK showed equal preference for both. For sentence types such as De, Ex, Im, and Ta, NE perceived higher similarity than NK.

PROSODY CONTROL BASED ON SYNTACTIC INFORMATION IN KOREAN TEXT-TO-SPEECH CONVERSION SYSTEM

  • Kim, Yeon-Jun;Oh, Yung-Hwan
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.937-942 / 1994
  • A text-to-speech (TTS) conversion system can convert any words or sentences into speech. To synthesize speech as human beings do, careful prosody control, including intonation, duration, accent, and pause, is required. It helps listeners understand the speech clearly and makes the speech sound more natural. In this paper, a prosody control scheme that makes use of function word information is proposed. Among the many factors of prosody, intonation, duration, and pause are closely related to syntactic structure, and their relations have been formalized and embodied in the TTS system. To evaluate the speech synthesized with the proposed prosody control, a subjective evaluation method, the MOS (Mean Opinion Score) method, was used. The synthesized speech was tested on 10 listeners, and each listener scored the speech between 1 and 5. The evaluation experiments show that the proposed prosody control helps the TTS system synthesize more natural speech.

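The MOS evaluation described in the abstract above amounts to averaging listener ratings on a 1 to 5 scale. The sketch below shows that arithmetic for a single stimulus rated by 10 listeners. The ratings are invented for illustration and are not results from the paper, and the function name `mean_opinion_score` is hypothetical.

```python
from statistics import mean

def mean_opinion_score(scores):
    """Average a list of listener ratings, each on the 1-5 MOS scale."""
    if not scores:
        raise ValueError("at least one rating is required")
    if any(not 1 <= s <= 5 for s in scores):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(scores)

# Hypothetical ratings from 10 listeners for one synthesized utterance.
ratings = [4, 3, 5, 4, 4, 3, 4, 5, 4, 4]
mos = mean_opinion_score(ratings)
```

Comparing the MOS of speech synthesized with and without a given prosody control scheme is the usual way such a score supports a naturalness claim.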