• Title/Summary/Keyword: Subjective Speech Quality Evaluation

Search Result 35, Processing Time 0.027 seconds

Correlation Analysis of PESQ and MOS Evaluation for HMM-based Synthetic Korean Speech (HMM 기반의 한국어 합성음에 대한 PESQ 및 MOS 평가의 상관도 분석)

  • Lin, Cang-Song;Bae, Keun-Sung
    • Phonetics and Speech Sciences
    • /
    • v.2 no.1
    • /
    • pp.71-75
    • /
    • 2010
  • The PESQ is an objective speech quality evaluation measure that is known to have a high correlation with a subjective speech quality measure such as MOS. To examine whether it could be useful as an objective quality measure of synthetic speech, we carried out both subjective evaluation tests with MOS and DMOS and an objective evaluation test with PESQ for HMM-based Korean synthetic speech signals and analyzed the correlation between them. Experimental results have shown that the PESQ has correlations of 0.87 with MOS and 0.92 with DMOS. It means that the PESQ holds much promise for evaluating the quality of synthetic Korean speech.

  • PDF

A Study on Objective Quality Assessment for Synthesized speech by Rule (규칙합성음의 객관적 품질평가에 관한 연구)

  • 홍진우;김순협
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.30B no.10
    • /
    • pp.42-49
    • /
    • 1993
  • In this paper, we evaluate the quality of synthesized speech by rule using the LPC CD as a objective measure, and then compare the test result with the subjective one. Speech used for the test consists of 108 words which are selected by word construction method using Korean attribute and frequency distribution, synthesized by demi-syllable rule. By evaluating the quality of synthesized speech by reule objectively, we have tried to resolve the problems such as lots of evaluation time, expansion of test scale, and variables of analysis result arised by subjective measure. We have, also, proved the validity of the objective test using the LPC CD, by comparing intelligibility which is the index for the subjective quality evaluation of synthesized speech by rule with MOS. From this results, we can provide a guide for quality assessment that would be useful in the R&D of synthesis method and the commercial products using synthesized speech.

  • PDF

A Study on Objective Quality Assessment of Synthesized Speech by Rule (규칙 합성음의 객관적 품질평가에 관한 연구)

  • 홍진우
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1991.06a
    • /
    • pp.67-72
    • /
    • 1991
  • This paper evaluates thequality of synthesized speech by rule using the LPC CD in the objective measure and then compares the result with the subjective analysis. By evaluating the quality of synthesized speech by rule objectively. We have tried to resolve the problems (Evaluation time or size expansion, variables within the analysis results) that arise when the evaluation is done subjectively. Also by comparing intelligibility-the index for the subjective quality evaluation of synthesized speech by rule-with evaluation results obtained using MOS and the objective evaluation. We have proved the validity of the objective analysis and thus provides a guide that would be useful when R&D and marketing of synthesis by rule method is done.

  • PDF

A Study of Subjective Speech Quality Measurement in VoIP (VoIP 음질의 주관적 평가에 관한 연구)

  • 강영도;강진석;최연성;김장형
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.5 no.2
    • /
    • pp.279-287
    • /
    • 2001
  • In this paper, we discuss the scale of subjective speech quality measurement over VoIP(Voice over IP) network which is a component of broadband networks. Objective parameters of multimedia services like PSNR or jitter can easily measured and defined, but these factors are not easily meet the user's perceptual recognition. We suggest the speech quality measurement scale through the subjective measurement for end-to-end speech quality composed of sender-side quality, transmission quality, receiver-side quality, which provide the degree of correctness of representation of speaker, the degree of impairment caused by various factors, the degree of recognition of processed speech, respectively. Also, we examined the proposed method and verify it's availability.

  • PDF

A Study on the Objective Evaluation Model of Telephone Transmission Quality (통화품질 객관평가 모델링에 관한 연구)

  • 조재철;박순영;방만원
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.16 no.6
    • /
    • pp.509-516
    • /
    • 1991
  • In this paper, we propose on objective evaluation model of telephone transmission qulity in order to estimate a satisfaction score regarding speech quality in a relephone network. As the degradantion factors of telephone transmission quality, this model takes into account transmission loss, noise, distortion, talker echo and sidetone. A performance index[PI] is introduced for five psychological factors affecting telephone speech qualty, and a Mean Opinion Score(MOS) is estimated from the sum of all Pis. The simulation results indicate theat the MOS obtained from the objective evaluation model is in good agreement with that of subjective evaluation.

  • PDF

Performance Evaluation of Novel AMDF-Based Pitch Detection Scheme

  • Kumar, Sandeep
    • ETRI Journal
    • /
    • v.38 no.3
    • /
    • pp.425-434
    • /
    • 2016
  • A novel average magnitude difference function (AMDF)-based pitch detection scheme (PDS) is proposed to achieve better performance in speech quality. A performance evaluation of the proposed PDS is carried out through both a simulation and a real-time implementation of a speech analysis-synthesis system. The parameters used to compare the performance of the proposed PDS with that of PDSs that are based on either a cepstrum, an autocorrelation function (ACF), an AMDF, or circular AMDF (CAMDF) methods are as follows: percentage gross pitch error (%GPE); a subjective listening test; an objective speech quality assessment; a speech intelligibility test; a synthesized speech waveform; computation time; and memory consumption. The proposed PDS results in lower %GPE and better synthesized speech quality and intelligibility for different speech signals as compared to the cepstrum-, ACF-, AMDF-, and CAMDF-based PDSs. The computational time of the proposed PDS is also less than that for the cepstrum-, ACF-, and CAMDF-based PDSs. Moreover, the total memory consumed by the proposed PDS is less than that for the ACF- and cepstrum-based PDSs.

Bandwidth Expansion Method Using Spline Codebook Based Spectral Folding (Spline 코드북 기반의 spectral folding을 이용한 대역폭 확장 방법)

  • Park, Ji-Hoon;Han, Seung-Ho;Yang, Hee-Sik;Jeong, Sang-Bae;Hahn, Min-Soo
    • Proceedings of the KSPS conference
    • /
    • 2006.11a
    • /
    • pp.131-134
    • /
    • 2006
  • Quality of narrowband speech $(0{\sim}4kHz)$ can be enhanced by the bandwidth expansion technique, by which the high- band components are estimated. This paper proposes the bandwidth expansion method using the spline codebook based spectral folding. For the performance evaluation, the PESQ(Perceptual Evaluation of Speech Quality) scores are measured as the objective measurement In addition, the MOS (Mean Opinion Score) and the preference tests are performed as the subjective measurement. The results show our proposed method outperforms the existing spline based one.

  • PDF

Classical Tamil Speech Enhancement with Modified Threshold Function using Wavelets

  • Indra., J;Kasthuri., N;Navaneetha Krishnan., S
    • Journal of Electrical Engineering and Technology
    • /
    • v.11 no.6
    • /
    • pp.1793-1801
    • /
    • 2016
  • Speech enhancement is a challenging problem due to the diversity of noise sources and their effects in different applications. The goal of speech enhancement is to improve the quality and intelligibility of speech by reducing noise. Many research works in speech enhancement have been accomplished in English and other European Languages. There has been limited or no such works or efforts in the past in the context of Tamil speech enhancement in the literature. The aim of the proposed method is to reduce the background noise present in the Tamil speech signal by using wavelets. New modified thresholding function is introduced. The proposed method is evaluated on several speakers and under various noise conditions including White Gaussian noise, Babble noise and Car noise. The Signal to Noise Ratio (SNR), Mean Square Error (MSE) and Mean Opinion Score (MOS) results show that the proposed thresholding function improves the speech enhancement compared to the conventional hard and soft thresholding methods.

A Study on Development of a Hearing Impairment Simulator considering Frequency Selectivity and Asymmetrical Auditory Filter of the Hearing Impaired (난청인의 주파수 선택도와 비대칭적 청각 필터를 고려한 난청 시뮬레이터 개발에 관한 연구)

  • Joo, Sang-Ick;Kang, Hyun-Deok;Song, Young-Rok;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.59 no.4
    • /
    • pp.831-840
    • /
    • 2010
  • In this paper, we propose a hearing impairment simulator considering reduced frequency selectivity and asymmetrical auditory filter of the hearing impaired, and we verified the reduced frequency selectivity and asymmetrical auditory filter affected in speech perception through experiments. The reduced frequency selectivity has made embodied by spectral smearing using LPC(linear prediction coding). The shapes of auditory filter are asymmetrical different with each center frequency. Hearing impaired person which has hearing loss was differently changed with that of normal hearing people and it has different value for speech of quality through auditory filter. The experiments confirmed subjective test and objective test. The subjective experiments are composed of 4 kinds of tests: pure tone test, SRT(speech reception threshold) test, and WRS(word recognition score) test without spectral smearing, and WRS test with spectral smearing. The experiment of the hearing impairment simulator was performed from 9 subjects who have normal ears. The amount of spectral smearing was controlled by LPC order. The asymmetrical auditory filter of proposed hearing impairment simulator was simulated and then some tests to estimate the filter's performance objectively were performed. The objective experiment as simulated auditory filter's performance evaluation method used PESQ(perceptual evaluation of speech quality) and LLR(log likelihood ratio) for speech through auditory filter. The processed speech was evaluated objective speech quality and distortion using PESQ and LLR value. When hearing loss processed, PESQ and LLR value have big difference according to asymmetrical auditory filter in hearing impairment simulator.

Implementation and Evaluation of an HMM-Based Speech Synthesis System for the Tagalog Language

  • Mesa, Quennie Joy;Kim, Kyung-Tae;Kim, Jong-Jin
    • MALSORI
    • /
    • v.68
    • /
    • pp.49-63
    • /
    • 2008
  • This paper describes the development and assessment of a hidden Markov model (HMM) based Tagalog speech synthesis system, where Tagalog is the most widely spoken indigenous language of the Philippines. Several aspects of the design process are discussed here. In order to build the synthesizer a speech database is recorded and phonetically segmented. The constructed speech corpus contains approximately 89 minutes of Tagalog speech organized in 596 spoken utterances. Furthermore, contextual information is determined. The quality of the synthesized speech is assessed by subjective tests employing 25 native Tagalog speakers as respondents. Experimental results show that the new system is able to obtain a 3.29 MOS which indicates that the developed system is able to produce highly intelligible neutral Tagalog speech with stable quality even when a small amount of speech data is used for HMM training.

  • PDF