• Title/Summary/Keyword: natural speech


Differences in High Pitch Accents between News Speech and Natural Speech (영어 뉴스와 자연발화에 나타나는 고성조 피치액센트의 차이점)

  • Choi, Yun-Hui;Lee, Joo-Kyeong
    • Speech Sciences / v.12 no.2 / pp.17-28 / 2005
  • This paper argues that news speech has an intonational pattern distinct from that of natural speech, reflecting its primary focus on providing new information. We conducted a phonetic experiment comparing the tonal contours of news speech and natural speech, examining the distribution of pitch accents and the overall pitch ranges. We used 70 Associated Press (AP) radio news utterances and 70 natural utterances extracted from TV dramas. Results show that news speech contains 3.38 H*'s (including L+H* and !H*) per intonational phrase (IP) or intermediate phrase (ip), whereas natural speech contains 1.8 on average. The most common number of IP/ip's per sentence is 3 in news speech (32.07% of the news utterances) but only 1 in natural speech (41.42%). Next, declination tends to be suppressed in news speech, and the pitch range is much greater in news speech than in natural speech. Finally, syllables with secondary stress receive a pitch accent comparatively often in news speech, in clear contrast to natural speech. These results can be interpreted as follows: news has the particular purpose of providing new information, so every content word tends to be given an H* or a related pitch accent such as L+H* or !H*, because news speech assumes that every word conveys new information. This leads to more IP/ip's per sentence because of a human physiological constraint; that is, more H*'s cause more respiratory breaks. The greater pitch ranges and the pitch accents on secondary-stressed syllables may likewise be attributed to the exaggeration of new information.
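
The two tallies reported in this abstract (H*-type accents per phrase, and the most common number of phrases per sentence) could be computed from ToBI-style labels roughly as follows. This is only a sketch, not the authors' procedure: the nested-list data layout, the names `H_STAR_FAMILY`, `mean_h_star_per_phrase`, and `modal_phrase_count`, and the toy labels are all assumptions made for illustration.

```python
from collections import Counter

# Accent labels counted as "H*" in the abstract: H*, L+H*, and !H*.
H_STAR_FAMILY = {"H*", "L+H*", "!H*"}


def mean_h_star_per_phrase(utterances):
    """Average number of H*-type accents per phrase (IP or ip).

    `utterances` is a list of utterances; each utterance is a list of
    phrases; each phrase is a list of pitch accent labels.
    """
    counts = [
        sum(label in H_STAR_FAMILY for label in phrase)
        for utterance in utterances
        for phrase in utterance
    ]
    return sum(counts) / len(counts)


def modal_phrase_count(utterances):
    """Most frequent number of phrases per utterance and its share in percent."""
    tally = Counter(len(utterance) for utterance in utterances)
    n_phrases, freq = tally.most_common(1)[0]
    return n_phrases, 100.0 * freq / len(utterances)


# Made-up toy labels (not data from the paper).
toy = [
    [["H*", "L+H*"], ["!H*", "L*"]],
    [["H*"]],
    [["H*", "H*", "!H*"], ["L+H*"]],
]
mean_accents = mean_h_star_per_phrase(toy)
modal_count, modal_share = modal_phrase_count(toy)
```

On real data, each inner list would hold the accent labels of one annotated IP or ip.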


Speech Recognition Interface in the Communication Environment (통신환경에서 음성인식 인터페이스)

  • Han, Tai-Kun;Kim, Jong-Keun;Lee, Dong-Wook
    • Proceedings of the KIEE Conference / 2001.07d / pp.2610-2612 / 2001
  • This study examines the recognition of users' spoken commands based on speech recognition and natural language processing, and develops a natural language interface agent that can analyze the recognized commands. The agent consists of a speech recognizer and a semantic interpreter. The speech recognizer recognizes a spoken command and transforms it into a character string. The semantic interpreter analyzes the character string and creates the commands and questions to be passed to the application program. We also consider problems related to the speech recognizer and the semantic interpreter, such as the ambiguity of natural language and the ambiguity and errors introduced by the speech recognizer. This kind of natural language interface agent can be applied to telephony environments involving all kinds of communication media, such as telephone, fax, and e-mail.
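
The second stage described in this abstract, in which a semantic interpreter turns the recognizer's character string into a command, might look something like the sketch below. The abstract does not describe the actual grammar or command set, so the keyword tables `ACTIONS` and `MEDIA`, the function `interpret`, and the "clarify" response for ambiguous input are all hypothetical.

```python
# Hypothetical keyword tables; the abstract does not specify the real grammar.
ACTIONS = {"call": "CALL", "dial": "CALL", "send": "SEND", "read": "READ"}
MEDIA = {
    "telephone": "TELEPHONE",
    "phone": "TELEPHONE",
    "fax": "FAX",
    "e-mail": "EMAIL",
    "email": "EMAIL",
}


def interpret(recognized_text):
    """Map a recognized character string to a command for the application.

    Returns a "command" dict when exactly one action keyword (and at most
    one medium) is found; otherwise returns a "clarify" dict so the caller
    can ask the user to rephrase (an assumed way to handle ambiguity).
    """
    tokens = recognized_text.lower().replace(",", " ").split()
    actions = sorted({ACTIONS[t] for t in tokens if t in ACTIONS})
    media = sorted({MEDIA[t] for t in tokens if t in MEDIA})
    if len(actions) != 1 or len(media) > 1:
        # Missing or conflicting keywords: treat the input as ambiguous.
        return {"type": "clarify", "actions": actions, "media": media}
    return {
        "type": "command",
        "action": actions[0],
        "medium": media[0] if media else None,
    }


result = interpret("Send a fax to Mr. Kim")
```

Here the string passed to `interpret` stands in for the speech recognizer's output. A real system would also need to cope with recognition errors, which this sketch ignores.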


A Study of Locke's Concept of Freedom of Speech as Proprietorship (소유권적 언론자유에 대한 일고찰 : 로크의 사회계약론을 중심으로)

  • Moon, Jong-Dae
    • Korean journal of communication and information / v.17 / pp.7-36 / 2001
  • This thesis discussed the nature of freedom of speech with emphasis on Locke's theory of the social contract. First, I examined the nature of freedom of speech derived from Locke's social contract and argued that it rests on the self-ownership of humans. Second, I studied how Locke's right of self-ownership was related to the right of freedom of speech and how it is realized in civil society, analyzing how freedom of speech was actualized unequally in social relations. Third, I investigated how Locke's possessive freedom of speech materialized in the market society, trying to identify the nature of its actualization in the capitalist market society. Finally, I studied to what extent Locke's state could intervene in freedom of speech and reconsidered the meaning of Locke's limits on natural right in modern society. In conclusion, Locke's notions of Natural Right and the Law of Nature have greatly influenced the contemporary idea of free speech. His ideas help explain the position of liberal democratic speech. They also show clearly the relation between freedom of speech and Natural Right and have helped us understand freedom of speech in terms of the right of property.


PROSODY CONTROL BASED ON SYNTACTIC INFORMATION IN KOREAN TEXT-TO-SPEECH CONVERSION SYSTEM

  • Kim, Yeon-Jun;Oh, Yung-Hwan
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.937-942 / 1994
  • A text-to-speech (TTS) conversion system can convert any words or sentences into speech. To synthesize speech as human beings produce it, careful prosody control, including intonation, duration, accent, and pause, is required. Such control helps listeners understand the speech clearly and makes it sound more natural. In this paper, a prosody control scheme that makes use of function-word information is proposed. Among the many factors of prosody, intonation, duration, and pause are closely related to syntactic structure, and their relations have been formalized and embodied in the TTS system. To evaluate the speech synthesized with the proposed prosody control, the Mean Opinion Score (MOS) method, a subjective evaluation method, was used. The synthesized speech was tested with 10 listeners, each of whom scored the speech between 1 and 5. The evaluation experiments show that the proposed prosody control helps the TTS system synthesize more natural speech.
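
The MOS evaluation mentioned in this abstract is an average of listener ratings on a 1 to 5 scale. The sketch below shows that arithmetic for 10 listeners. The function name `mean_opinion_score` and both score lists are made up for illustration and are not results from the paper.

```python
def mean_opinion_score(scores):
    """Mean Opinion Score: the arithmetic mean of ratings on a 1-5 scale."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(not 1 <= s <= 5 for s in scores):
        raise ValueError("each score must lie between 1 and 5")
    return sum(scores) / len(scores)


# Hypothetical ratings from 10 listeners (not data from the paper).
baseline_scores = [3, 3, 2, 4, 3, 3, 2, 3, 4, 3]
proposed_scores = [4, 4, 3, 4, 5, 4, 3, 4, 4, 4]

baseline_mos = mean_opinion_score(baseline_scores)
proposed_mos = mean_opinion_score(proposed_scores)
```

A higher MOS for one system over another would indicate that listeners judged it more natural, though with only 10 listeners the difference should be read with caution.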


Formant Locus Overlapping Method to Enhance Naturalness of Synthetic Speech (합성음의 자연도 향상을 위한 포먼트 궤적 중첩 방법)

  • 안승권;성굉모
    • Journal of the Korean Institute of Telematics and Electronics B / v.28B no.10 / pp.755-760 / 1991
  • In this paper, we propose a new formant locus overlapping method that can effectively enhance the naturalness of synthetic speech produced by a demisyllable-based Korean text-to-speech system. First, Korean demisyllables are divided into several segments that have linear formant transition characteristics. Then a database composed of the start point and length of each formant segment is built. When synthesizing speech from this demisyllable database, we concatenate the formant loci using the proposed overlapping method, which closely simulates the human articulation mechanism. We have implemented a Korean text-to-speech system using this method and shown that the formant loci of the synthetic speech are similar to those of natural speech. Finally, we show that the spectrograms produced by the proposed method are more similar to natural speech than those produced by the conventional method.


Information Dimensions of Speech Phonemes

  • Lee, Chang-Young
    • Speech Sciences / v.3 / pp.148-155 / 1998
  • As an application of dimensional analysis in the theory of chaos and fractals, we studied and estimated the information dimension of various phonemes. By constructing phase-space vectors from the time-series speech signals, we calculated the natural measure and the Shannon information from the trajectories. The information dimension was then obtained as the slope of the plot of the information versus the space division order. The information dimension proved to be sensitive to the waveform and the time delay. By averaging over frames for various phonemes, we found that the information dimension ranges from 1.2 to 1.4.
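
The estimation steps named in this abstract (delay embedding, natural measure, Shannon information, slope against division order) can be sketched as below. This is a generic box-counting illustration under stated assumptions, not the author's implementation: the grid with 2**order cells per axis, the base-2 logarithm, the default embedding dimension, delay and range of orders, and all function names are choices made here.

```python
import math
from collections import Counter


def delay_embed(signal, dim, delay):
    """Build phase-space vectors (x[n], x[n+delay], ..., x[n+(dim-1)*delay])."""
    n = len(signal) - (dim - 1) * delay
    return [tuple(signal[i + j * delay] for j in range(dim)) for i in range(n)]


def shannon_information(points, order):
    """Shannon information (bits) of the natural measure on a grid that
    splits each normalized axis into 2**order cells."""
    dim = len(points[0])
    lows = [min(p[j] for p in points) for j in range(dim)]
    spans = [(max(p[j] for p in points) - lows[j]) or 1.0 for j in range(dim)]
    cells = 2 ** order
    # Natural measure: the fraction of trajectory points that fall in each cell.
    occupancy = Counter(
        tuple(
            min(int((p[j] - lows[j]) / spans[j] * cells), cells - 1)
            for j in range(dim)
        )
        for p in points
    )
    total = len(points)
    return -sum((c / total) * math.log2(c / total) for c in occupancy.values())


def information_dimension(signal, dim=3, delay=5, orders=(2, 3, 4, 5, 6)):
    """Least-squares slope of Shannon information versus division order."""
    points = delay_embed(signal, dim, delay)
    info = [shannon_information(points, k) for k in orders]
    mean_k = sum(orders) / len(orders)
    mean_i = sum(info) / len(info)
    num = sum((k - mean_k) * (i - mean_i) for k, i in zip(orders, info))
    den = sum((k - mean_k) ** 2 for k in orders)
    return num / den


# Sanity check on a pure sine, whose embedded trajectory is a closed curve.
sine = [math.sin(0.07 * n) for n in range(20000)]
estimate = information_dimension(sine, dim=2, delay=22)
```

For the sine wave the estimate should come out close to 1, as expected for a one-dimensional curve. As the abstract notes for speech, the result is sensitive to the time delay, and it also depends on the range of division orders used for the fit.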


A Study on Vocabulary-Independent Continuous Speech Recognition System for Intelligent Home Network System (지능형 홈네트워크 시스템을 위한 가변어휘 연속음성인식시스템에 관한 연구)

  • Lee, Ho-Woong;Jeong, Hee-Suk
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.7 no.2 / pp.37-42 / 2008
  • In this paper, a vocabulary-independent continuous speech recognition system for speech control of an intelligent home network is presented. The study proposes a keyword-based conversational scenario of continuous natural vocabulary for recognizing natural speech commands, and a way of optimizing the system by building the recognizer and its database around keywords.


SPEECH SYNTHESIS USING LARGE SPEECH DATA-BASE

  • Lee, Kyu-Keon;Mochida, Takemi;Sakurai, Naohiro;Shirai, Katsuhiko
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.949-956 / 1994
  • In this paper, we introduce a new speech synthesis method for arbitrary Japanese and Korean sentences using a natural speech data-base. The application of this method to a CAI system is also discussed. In our synthesis method, a basic sentence and basic accent-phrases are selected from the data-base to match a target sentence. The selection factors are phrase dependency structure (separation degree), number of morae, type of accent, and phonemic labels. The target pitch pattern and phonemic parameter series are generated from the selected basic units. Because the pitch pattern is generated from patterns extracted directly from real speech, it is expected to be more natural than a pattern estimated by a model. So far, we have examined this method on Japanese sentence speech and confirmed that the synthetic sound preserves human-like features fairly well. We now extend the method to Korean sentence speech synthesis. Furthermore, we are trying to apply this synthesis unit to a CAI system.


Speech Synthesis Algorithm Using Mixed Phase Information for TTS Systems (혼합 위상 정보를 이용한 TTS 합성음 생성 알고리즘)

  • Kwon, Chul-Hong;Lee, Min-Kyu
    • Speech Sciences / v.8 no.4 / pp.35-43 / 2001
  • New speech synthesis algorithms capable of flexible prosody (especially F0) modification are desired for a high-quality TTS system. TD-PSOLA is the most popular synthesis algorithm. It delivers very high quality when F0 modification is limited. However, the quality degradation caused by pitch epoch detection errors becomes severe as the F0 modification factor becomes large. The vocoder framework, on the other hand, is very flexible in F0 manipulation, but the speech it synthesizes is far from natural human speech and suffers from buzziness. To remedy the buzzy quality of the vocoder and produce more natural synthetic speech, we propose a mixed phase vocoder.


A Study on Voice Color Control Rules for Speech Synthesis System (음성합성시스템을 위한 음색제어규칙 연구)

  • Kim, Jin-Young;Eom, Ki-Wan
    • Speech Sciences / v.2 / pp.25-44 / 1997
  • When listening to the various speech synthesis systems developed and used in our country, we find that although their quality has improved, they lack naturalness. Moreover, since the voice color of each system is limited to a single recorded speech DB, another speech DB must be recorded to create a different voice color. 'Voice color' is an abstract concept that characterizes voice personality, so speech synthesis systems need a voice color control function to create various voices. The aim of this study is to examine several factors of voice color control rules for a text-to-speech system that produces natural synthetic speech in a variety of voice types. To derive such rules from natural speech, glottal source parameters and frequency characteristics of the vocal tract were studied for several voice colors. In this paper, voice colors were catalogued as deep, sonorous, thick, soft, harsh, high tone, shrill, and weak. The LF-model was used as the voice source model, and the formant frequencies, bandwidths, and amplitudes were used for the frequency characteristics of the vocal tract. These acoustic parameters were tested through multiple regression analysis to establish the general relation between the parameters and the voice colors.
