• Title/Summary/Keyword: Speech

Comparison of Speech Rate and Long-Term Average Speech Spectrum between Korean Clear Speech and Conversational Speech

  • Yoo, Jeeun;Oh, Hongyeop;Jeong, Seungyeop;Jin, In-Ki
    • Journal of Audiology & Otology / v.23 no.4 / pp.187-192 / 2019
  • Background and Objectives: Clear speech is an effective communication strategy used in difficult listening situations that draws on techniques such as accurate articulation, a slow speech rate, and the inclusion of pauses. Although excessively slow speech and improperly amplified spectral information can degrade overall speech intelligibility, certain amplitude increments in the mid-frequency bands (1 to 3 dB) and speech rates around 50% slower than those of conversational speech have been reported to improve speech intelligibility. The purpose of this study was to identify whether amplitude increments in the mid-frequency bands and slower speech rates were evident in Korean clear speech as they are in English clear speech. Subjects and Methods: To compare the acoustic characteristics of the two methods of speech production, the voices of 60 participants were recorded during conversational speech and then again during clear speech using standardized sentence material. Results: The speech rate and long-term average speech spectrum (LTASS) were analyzed and compared. Speech rates for clear speech were slower than those for conversational speech. Increased amplitudes in the mid-frequency bands were evident for the LTASS of clear speech. Conclusions: The observed differences in the acoustic characteristics between the two types of speech production suggest that Korean clear speech can be an effective communication strategy to improve speech intelligibility.

Characteristics of speech intelligibility and speech acceptability connected with mouth opening condition (구강 개방 상태에 따른 말 명료도 및 말 용인도 특성)

  • Song, Yun-Kyung
    • Phonetics and Speech Sciences / v.3 no.3 / pp.141-148 / 2011
  • Many factors affect speech intelligibility and speech acceptability. Structural anomalies and neuromotor pathologies are known causes of abnormal speech sounds, and there are also minor variations related to the oral mechanism. Speaking with a restricted mouth opening, whether because of a therapeutic procedure or a habitual speech pattern, might affect the quality of speech sounds. This study therefore compared the speech intelligibility and speech acceptability of 24 recorded words in two conditions (restricted mouth opening and normal mouth opening) by 30 normal-hearing adults. The results showed that speech intelligibility and speech acceptability were significantly lower in the restricted mouth opening condition. In that condition, speech acceptability was also significantly lower than speech intelligibility, and it was lower especially for open vowels. These findings indicate that the mouth opening condition could affect vowel shape and could adversely affect speech intelligibility and speech acceptability.

Performance Comparison of the Speech Enhancement Methods for Noisy Speech Recognition (잡음음성인식을 위한 음성개선 방식들의 성능 비교)

  • Chung, Yong-Joo
    • Phonetics and Speech Sciences / v.1 no.2 / pp.9-14 / 2009
  • Speech enhancement methods can generally be classified into a few categories, and they have usually been compared with each other in terms of speech quality. For the successful use of speech enhancement methods in speech recognition systems, performance comparisons in terms of speech recognition accuracy are necessary. In this paper, we compared the speech recognition performance of some representative speech enhancement algorithms that are frequently cited in the literature and widely used. We also compared the performance of speech enhancement methods with that of other noise-robust speech recognition methods, such as PMC, to verify the usefulness of speech enhancement approaches in noise-robust speech recognition systems.

Performance Enhancement of Speech Intelligibility in Communication System Using Combined Beamforming (directional microphone) and Speech Filtering Method (방향성 마이크로폰과 음성 필터링을 이용한 통신 시스템의 음성 인지도 향상)

  • Shin, Min-Cheol;Wang, Se-Myung
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2005.05a / pp.334-337 / 2005
  • Speech intelligibility is one of the most important factors in a communication system and is related to the speech-to-noise ratio. To enhance the speech-to-noise ratio, background noise reduction techniques are being developed. As part of a solution for noise reduction, this paper introduces a directional microphone based on beamforming together with a speech filtering method. The directional microphone narrows the spatial range of the processed signal to the direction of the target speech signal. Noise located in the same direction as the speech, however, still remains in the processed signal. To separate this mixed signal into speech and noise, a speech filtering method is then applied to extract only the speech signal from the processed signal. The speech filtering method is based on the characteristics of the speech signal itself. The combined directional microphone and speech filtering method improves speech intelligibility in the communication system.

A MFCC-based CELP Speech Coder for Server-based Speech Recognition in Network Environments (네트워크 환경에서 서버용 음성 인식을 위한 MFCC 기반 음성 부호화기 설계)

  • Lee, Gil-Ho;Yoon, Jae-Sam;Oh, Yoo-Rhee;Kim, Hong-Kook
    • MALSORI / no.54 / pp.27-43 / 2005
  • Existing standard speech coders can provide high-quality speech communication, but they degrade the performance of speech recognition systems that use the speech reconstructed by the coders. The main cause of the degradation is that the spectral envelope parameters in speech coding are optimized for speech quality rather than for speech recognition performance. For example, mel-frequency cepstral coefficients (MFCCs) are generally known to provide better speech recognition performance than linear prediction coefficients (LPCs), the typical parameter set in speech coding. In this paper, we propose a speech coder using MFCCs instead of LPCs to improve the performance of a server-based speech recognition system in network environments. The main difficulty in using MFCCs, however, is developing an efficient MFCC quantization scheme at a low bit rate. First, we explore the interframe correlation of MFCCs, which leads to predictive quantization of the MFCCs. Second, a safety-net scheme is proposed to make the MFCC-based speech coder robust to channel errors. As a result, we propose an 8.7 kbps MFCC-based CELP coder. A PESQ test shows that the proposed speech coder has speech quality comparable to that of the 8 kbps G.729 coder, while speech recognition performance using the proposed coder is better than that using G.729.

Speech Rate Variation in Synchronous Speech (동시발화에 나타나는 발화 속도 변이 분석)

  • Kim, Miran;Nam, Hosung
    • Phonetics and Speech Sciences / v.4 no.4 / pp.19-27 / 2012
  • When two speakers read a text together, the resulting speech has been shown to exhibit considerably reduced variability (e.g., in pause duration and placement, and in speech rate). This paper provides a quantitative analysis of speech rate variation in synchronous speech by examining the global and local patterns in two dialects of Mandarin Chinese (Taiwan and Shanghai). We analyzed the speech data in terms of mean speech rate and with reference to the just noticeable difference (JND), both within and across subjects. Our findings show that speakers have lower and less variable speech rates when they read a text synchronously than when they read alone. This global pattern is observed consistently across speakers and dialects, while each dialect maintains its own local pattern of speech rate variation. We conclude that paired speakers lower their speech rates and decrease the variability in order to ensure the synchrony of their speech.

Speech Rates of Male Esophageal Speech (식도발성 남성 발화의 말 속도)

  • Park, Won-Kyoung;Shim, Hee-Jeong;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.4 no.3 / pp.143-149 / 2012
  • The purpose of this study is to investigate the speech rate of an esophageal speech group that is capable of vocalization after surgery. The subjects in this experiment were 10 male esophageal speakers and 10 male laryngeal speakers. Each group read a passage that was recorded with a DAT recorder (Roland, EDIROL R-09). The recordings were analyzed using CSL (Computerized Speech Lab, model 4150). The results were as follows: (1) the overall speech rate of esophageal speech was 2.50 SPS (syllables per second), while the overall speech rate of laryngeal speech was 4.23 SPS; (2) the articulatory rate of esophageal speech was 3.14 SPS, while the articulatory rate of laryngeal speech was 4.75 SPS. Both the speech rates and the articulatory rates of esophageal speech were significantly lower than those of laryngeal speech. These differences between the two groups may be due to the reduced efficiency of airflow across the pharyngeal-esophageal segment in esophageal speakers compared with airflow through the glottis in laryngeal speakers. These results may provide a guideline for speech rates of esophageal speakers in clinical settings.
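
The abstract above reports both an overall speech rate and an articulatory rate in syllables per second but does not spell out how each was computed. The sketch below assumes the common convention that the overall rate divides the syllable count by the total duration including pauses, while the articulatory rate excludes pause time. The function names and the example values (100 syllables, 40 s, 8 s of pauses) are hypothetical and are not taken from the study.

```python
def speech_rate(syllables: int, total_seconds: float) -> float:
    """Overall speech rate in syllables per second (pauses included)."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return syllables / total_seconds


def articulation_rate(syllables: int, total_seconds: float,
                      pause_seconds: float) -> float:
    """Articulation rate in syllables per second (pause time excluded)."""
    speaking_seconds = total_seconds - pause_seconds
    if speaking_seconds <= 0:
        raise ValueError("pause_seconds must be shorter than total_seconds")
    return syllables / speaking_seconds


# Hypothetical reading: 100 syllables in 40 s, of which 8 s are pauses.
overall = speech_rate(100, 40.0)
articulation = articulation_rate(100, 40.0, 8.0)
print(f"overall: {overall:.2f} SPS, articulation: {articulation:.2f} SPS")
```

Under this convention the articulation rate is never lower than the overall rate for the same recording, which is consistent with the ordering of the values reported for both speaker groups.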

An aerodynamic and acoustic characteristics of Clear Speech in patients with Parkinson's disease (파킨슨 환자의 클리어 스피치 전후 음향학적 공기역학적 특성)

  • Shin, Hee Baek;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.9 no.3 / pp.67-74 / 2017
  • An increase in speech intelligibility has been found in Clear Speech compared to conversational speech. Clear Speech is defined by decreased articulation rates and increased frequency and length of pauses. The objective of the present study was to investigate improvement in immediate speech intelligibility in 10 patients with Parkinson's disease (age range: 46 to 75 years) using Clear Speech. The experiment was performed using the Phonatory Aerodynamic System 6600 while the participants read the first sentence of a Sanchaek passage and the "List for Adults 1" in the Sentence Recognition Test (SRT) using casual speech and Clear Speech. Acoustic and aerodynamic parameters that affect speech intelligibility were measured, including mean F0, F0 range, intensity, speaking rate, mean airflow rate, and respiratory rate. In the Sanchaek passage, use of Clear Speech resulted in significant differences in mean F0, F0 range, speaking rate, and respiratory rate, compared with the use of casual speech. In the SRT list, significant differences were seen in mean F0, F0 range, and speaking rate. Based on these findings, it is claimed that speech intelligibility can be affected by adjusting breathing and tone in Clear Speech. Future studies should identify the benefits of Clear Speech through auditory-perceptual studies and evaluate programs that use Clear Speech to increase intelligibility.

The Correlation between Speech Intelligibility and Acoustic Measurements in Children with Speech Sound Disorders (말소리장애 아동의 말명료도와 음향학적 측정치 간 상관관계)

  • Kang, Eunyeong
    • Journal of The Korean Society of Integrative Medicine / v.6 no.4 / pp.191-206 / 2018
  • Purpose: This study investigated the correlation between speech intelligibility and acoustic measurements of speech sounds produced by children with speech sound disorders and children without any diagnosed speech sound disorder. Methods: A total of 60 children with and without speech sound disorders were the subjects of this study. Speech samples were obtained by having the subjects speak meaningful words. Acoustic measurements were analyzed on a spectrogram using the Multi-speech 3700 program. Speech intelligibility was determined according to a listener's perceptual judgment. Results: Children with speech sound disorders had significantly lower speech intelligibility than those without speech sound disorders. The intensity of the vowel /u/, the duration of the vowel /${\omega}$/, and the second formant of the vowel /${\omega}$/ differed significantly between the two groups. There was no difference in voice onset time between the groups. There was a correlation between acoustic measurements and speech intelligibility. Conclusion: The results of this study showed that the speech intelligibility of children with speech sound disorders was affected by intensity, word duration, and formant frequency. It is necessary to complement clinical results with acoustic measurements in addition to the evaluation of speech intelligibility.