• Title/Summary/Keyword: Improved Speech Parameter

Search Result 37, Processing Time 0.025 seconds

Background Noise Classification in Noisy Speech of Short Time Duration Using Improved Speech Parameter (개량된 음성매개변수를 사용한 지속시간이 짧은 잡음음성 중의 배경잡음 분류)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.9
    • /
    • pp.1673-1678
    • /
    • 2016
  • In the area of the speech recognition processing, background noises are caused the incorrect response to the speech input, therefore the speech recognition rates are decreased by the background noises. Accordingly, a more high level noise processing techniques are required since these kinds of noise countermeasures are not simple. Therefore, this paper proposes an algorithm to distinguish between the stationary background noises or non-stationary background noises and the speech signal having short time duration in the noisy environments. The proposed algorithm uses the characteristic parameter of the improved speech signal as an important measure in order to distinguish different types of the background noises and the speech signals. Next, this algorithm estimates various kinds of the background noises using a multi-layer perceptron neural network. In this experiment, it was experimentally clear the estimation of the background noises and the speech signals.

An Efficient Model Parameter Compensation Method foe Robust Speech Recognition

  • Chung Yong-Joo
    • MALSORI
    • /
    • no.45
    • /
    • pp.107-115
    • /
    • 2003
  • An efficient method that compensates the HMM parameters for the noisy speech recognition is proposed. Instead of assuming some analytical approximations as in the PMC, the proposed method directly re-estimates the HMM parameters by the segmental k-means algorithm. The proposed method has shown improved results compared with the conventional PMC method at reduced computational cost.

  • PDF

A Study on the Noisy Speech Recognition Based on the Data-Driven Model Parameter Compensation (직접데이터 기반의 모델적응 방식을 이용한 잡음음성인식에 관한 연구)

  • Chung, Yong-Joo
    • Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.247-257
    • /
    • 2004
  • There has been many research efforts to overcome the problems of speech recognition in the noisy conditions. Among them, the model-based compensation methods such as the parallel model combination (PMC) and vector Taylor series (VTS) have been found to perform efficiently compared with the previous speech enhancement methods or the feature-based approaches. In this paper, a data-driven model compensation approach that adapts the HMM(hidden Markv model) parameters for the noisy speech recognition is proposed. Instead of assuming some statistical approximations as in the conventional model-based methods such as the PMC, the statistics necessary for the HMM parameter adaptation is directly estimated by using the Baum-Welch algorithm. The proposed method has shown improved results compared with the PMC for the noisy speech recognition.

  • PDF

Improved Text-to-Speech Synthesis System Using Articulatory Synthesis and Concatenative Synthesis (조음 합성과 연결 합성 방식을 결합한 개선된 문서-음성 합성 시스템)

  • 이근희;김동주;홍광석
    • Proceedings of the IEEK Conference
    • /
    • 2002.06d
    • /
    • pp.369-372
    • /
    • 2002
  • In this paper, we present an improved TTS synthesis system using articulatory synthesis and concatenative synthesis. In concatenative synthesis, segments of speech are excised from spoken utterances and connected to form the desired speech signal. We adopt LPC as a parameter, VQ to reduce the memory capacity, and TD-PSOLA to solve the naturalness problem.

  • PDF

Voice Activity Detection Using Global Speech Absence Probability Based on Teager Energy in Noisy Environments (잡음환경에서 Teager Energy 기반의 전역 음성부재확률을 이용하는 음성검출)

  • Park, Yun-Sik;Lee, Sang-Min
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.1
    • /
    • pp.97-103
    • /
    • 2012
  • In this paper, we propose a novel voice activity detection (VAD) algorithm to effectively distinguish speech from nonspeech in various noisy environments. Global speech absence probability (GSAP) derived from likelihood ratio (LR) based on the statistical model is widely used as the feature parameter for VAD. However, the feature parameter based on conventional GSAP is not sufficient to distinguish speech from noise at low SNRs (signal-to-noise ratios). The presented VAD algorithm utilizes GSAP based on Teager energy (TE) as the feature parameter to provide the improved performance of decision for speech segments in noisy environment. Performances of the proposed VAD algorithm are evaluated by objective test under various environments and better results compared with the conventional methods are obtained.

A Study on the Performance of TDNN-Based Speech Recognizer with Network Parameters

  • Nam, Hojung;Kwon, Y.;Paek, Inchan;Lee, K.S.;Yang, Sung-Il
    • The Journal of the Acoustical Society of Korea
    • /
    • v.16 no.2E
    • /
    • pp.32-37
    • /
    • 1997
  • This paper proposes a isolated speech recognition method of Korean digits using a TDNN(Time Delay Neural Network) which is able to recognizc time-varying speech properties. We also make an investigation of effect on network parameter of TDNN ; hidden layers and time-delays. TDNNs in our experiments consist of 2 and 3 hidden layers and have several time-delays. From experiment result, TDNN structure which has 2 hidden-layers, gives a good result for speech recognition of Korean digits. Mis-recognition by time-delays can be improved by changing TDNN structures and mis-recognition separated from time-delays can be improved by changing input patterns.

  • PDF

An Improved Speech Absence Probability Estimation based on Environmental Noise Classification (환경잡음분류 기반의 향상된 음성부재확률 추정)

  • Son, Young-Ho;Park, Yun-Sik;An, Hong-Sub;Lee, Sang-Min
    • The Journal of the Acoustical Society of Korea
    • /
    • v.30 no.7
    • /
    • pp.383-389
    • /
    • 2011
  • In this paper, we propose a improved speech absence probability estimation algorithm by applying environmental noise classification for speech enhancement. The previous speech absence probability required to seek a priori probability of speech absence was derived by applying microphone input signal and the noise signal based on the estimated value of a posteriori SNR threshold. In this paper, the proposed algorithm estimates the speech absence probability using noise classification algorithm which is based on Gaussian mixture model in order to apply the optimal parameter each noise types, unlike the conventional fixed threshold and smoothing parameter. Performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 PESQ (perceptual evaluation of speech quality) and composite measure under various noise environments. It is verified that the proposed algorithm yields better results compared to the conventional speech absence probability estimation algorithm.

Speaker Independent Recognition Algorithm based on Parameter Extraction by MFCC applied Wiener Filter Method (위너필터법이 적용된 MFCC의 파라미터 추출에 기초한 화자독립 인식알고리즘)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.6
    • /
    • pp.1149-1154
    • /
    • 2017
  • To obtain good recognition performance of speech recognition system under background noise, it is very important to select appropriate feature parameters of speech. The feature parameter used in this paper is Mel frequency cepstral coefficient (MFCC) with the human auditory characteristics applied to Wiener filter method. That is, the feature parameter proposed in this paper is a new method to extract the parameter of clean speech signal after removing background noise. The proposed method implements the speaker recognition by inputting the proposed modified MFCC feature parameter into a multi-layer perceptron network. In this experiments, the speaker independent recognition experiments were performed using the MFCC feature parameter of the 14th order. The average recognition rates of the speaker independent in the case of the noisy speech added white noise are 94.48%, which is an effective result. Comparing the proposed method with the existing methods, the performance of the proposed speaker recognition is improved by using the modified MFCC feature parameter.

Speech Enhancement Based on IMCRA Incorporating noise classification algorithm (잡음 환경 분류 알고리즘을 이용한 IMCRA 기반의 음성 향상 기법)

  • Song, Ji-Hyun;Park, Gyu-Seok;An, Hong-Sub;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.61 no.12
    • /
    • pp.1920-1925
    • /
    • 2012
  • In this paper, we propose a novel method to improve the performance of the improved minima controlled recursive averaging (IMCRA) in non-stationary noisy environment. The conventional IMCRA algorithm efficiently estimate the noise power by averaging past spectral power values based on a smoothing parameter that is adjusted by the signal presence probability in frequency subbands. Since the minimum of smoothing parameter is defined as 0.85, it is difficult to obtain the robust estimates of the noise power in non-stationary noisy environments that is rapidly changed the spectral characteristics such as babble noise. For this reason, we proposed the modified IMCRA, which adaptively estimate and updata the noise power according to the noise type classified by the Gaussian mixture model (GMM). The performances of the proposed method are evaluated by perceptual evaluation of speech quality (PESQ) and composite measure under various environments and better results compared with the conventional method are obtained.

Noise Reduction Using MMSE Estimator-based Adaptive Comb Filtering (MMSE Estimator 기반의 적응 콤 필터링을 이용한 잡음 제거)

  • Park, Jeong-Sik;Oh, Yung-Hwan
    • MALSORI
    • /
    • no.60
    • /
    • pp.181-190
    • /
    • 2006
  • This paper describes a speech enhancement scheme that leads to significant improvements in recognition performance when used in the ASR front-end. The proposed approach is based on adaptive comb filtering and an MMSE-related parameter estimator. While adaptive comb filtering reduces noise components remarkably, it is rarely effective in reducing non-stationary noises. Furthermore, due to the uniformly distributed frequency response of the comb-filter, it can cause serious distortion to clean speech signals. This paper proposes an improved comb-filter that adjusts its spectral magnitude to the original speech, based on the speech absence probability and the gain modification function. In addition, we introduce the modified comb filtering-based speech enhancement scheme for ASR in mobile environments. Evaluation experiments carried out using the Aurora 2 database demonstrate that the proposed method outperforms conventional adaptive comb filtering techniques in both clean and noisy environments.

  • PDF