• Title, Summary, Keyword: Speech Enhancement


Speech Enhancement Using Lip Information and SFM (입술정보 및 SFM을 이용한 음성의 음질향상알고리듬)

  • Baek, Seong-Joon;Kim, Jin-Young
    • Speech Sciences
    • /
    • v.10 no.2
    • /
    • pp.77-84
    • /
    • 2003
  • In this research, we locate the beginning of speech and detect the stationary speech region using lip information. By taking a running average of the estimated speech signal within the stationary region, we reduce the musical noise inherent to the conventional MMSE (Minimum Mean Square Error) speech enhancement algorithm. In addition, an SFM (Spectral Flatness Measure) is incorporated to reduce the speech estimation error caused by speaking habits and missing lip information. According to an MOS (Mean Opinion Score) test, the proposed algorithm combined with Wiener filtering outperforms the conventional methods.
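
The spectral flatness measure named in this abstract is commonly defined as the ratio of the geometric mean to the arithmetic mean of a power spectrum. The following is a minimal sketch of that common definition, not the authors' implementation; the function name `spectral_flatness` and the `floor` parameter are assumptions introduced here for illustration.

```python
import math

def spectral_flatness(power_spectrum, floor=1e-12):
    """Geometric mean divided by arithmetic mean of a power spectrum.

    Values near 1 indicate a flat (noise-like) spectrum; values near 0
    indicate a peaky (tonal or voiced) spectrum.
    `floor` is a hypothetical guard so the logarithm is always defined.
    """
    bins = [max(p, floor) for p in power_spectrum]
    # Compute the geometric mean in the log domain to avoid underflow.
    log_geometric_mean = sum(math.log(p) for p in bins) / len(bins)
    arithmetic_mean = sum(bins) / len(bins)
    return math.exp(log_geometric_mean) / arithmetic_mean
```

A flat spectrum such as `[1.0, 1.0, 1.0, 1.0]` gives a value of 1, while a spectrum dominated by a single bin gives a value close to 0.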


Speech enhancement system using the multi-band coherence function and spectral subtraction method (다중 주파수 밴드 간섭함수와 스펙트럼 차감법을 이용한 음성 향상 시스템)

  • Oh, Inkyu;Lee, Insung
    • The Journal of the Acoustical Society of Korea
    • /
    • v.38 no.4
    • /
    • pp.406-413
    • /
    • 2019
  • This paper proposes a speech enhancement method that combines a coherence-based gain function with spectral subtraction for an array of two closely spaced microphones. A method that uses a gain function estimated from the SNR (Signal-to-Noise Ratio) based on the multi-frequency-band coherence function degrades when the input noises of the two channels are highly correlated. We therefore propose a method that uses a weighted gain function obtained by combining this gain function with the gain function from spectral subtraction. The proposed method is evaluated by comparing PESQ (Perceptual Evaluation of Speech Quality) values, an objective quality measure provided by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector). In the PESQ tests, the PESQ value improves by up to 0.217 across various background noise environments.
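
The idea of blending a coherence-based gain with a spectral subtraction gain can be sketched per frequency bin as follows. This is an illustration under stated assumptions, not the paper's formula: the abstract does not give the weighting rule, so the convex combination, the `weight` parameter, and the `floor` value are all hypothetical, and the coherence-based gain is taken as an already computed input.

```python
import math

def spectral_subtraction_gain(noisy_power, noise_power, floor=0.05):
    """Amplitude gain from power spectral subtraction.

    The gain is floored (hypothetical value) so it never reaches zero.
    """
    if noisy_power <= 0.0:
        return floor
    gain_squared = 1.0 - noise_power / noisy_power
    return max(math.sqrt(max(gain_squared, 0.0)), floor)

def combined_gain(coherence_gain, subtraction_gain, weight=0.5):
    """Convex combination of the two gains.

    `weight` is an assumed tuning parameter in [0, 1]; the paper's
    actual weighting scheme is not stated in the abstract.
    """
    return weight * coherence_gain + (1.0 - weight) * subtraction_gain
```

With `weight` close to 0 the result leans on spectral subtraction, which is the direction one would want when correlated noise makes the coherence-based gain unreliable.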

Speech enhancement method based on feature compensation gain for effective speech recognition in noisy environments (잡음 환경에 효과적인 음성인식을 위한 특징 보상 이득 기반의 음성 향상 기법)

  • Bae, Ara;Kim, Wooil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.38 no.1
    • /
    • pp.51-55
    • /
    • 2019
  • This paper proposes a speech enhancement method that uses a feature compensation gain for robust speech recognition in noisy environments. The gain is obtained from the PCGMM (Parallel Combined Gaussian Mixture Model)-based feature compensation method employing variational model composition. Experimental results show that the proposed method significantly outperforms conventional front-end algorithms and our previous research over various background noise types and SNR (Signal-to-Noise Ratio) conditions under mismatched ASR (Automatic Speech Recognition) system conditions. Employing the noise model selection technique significantly reduces the computational complexity while maintaining speech recognition performance at a similar level.

Rao-Blackwellized Particle Filtering for Sequential Speech Enhancement (Rao-Blackwellized particle filter를 이용한 순차적 음성 강조)

  • Park Sun-Ho;Choi Seun-Jin
    • Proceedings of the Korean Information Science Society Conference
    • /
    • /
    • pp.151-153
    • /
    • 2006
  • We present a method of sequential speech enhancement in which we infer the clean speech signal from a noise-contaminated observed signal using a Rao-Blackwellized particle filter (RBPF). In contrast to Kalman filtering-based methods, we consider a non-Gaussian speech generative model based on the generalized auto-regressive (GAR) model. Model parameters are learned by sequential Newton-Raphson expectation maximization (SNEM), incorporating the RBPF. An empirical comparison with the Kalman filter confirms the high performance of the proposed method.


A User-friendly Remote Speech Input Method in Spontaneous Speech Recognition System

  • Suh, Young-Joo;Park, Jun;Lee, Young-Jik
    • The Journal of the Acoustical Society of Korea
    • /
    • v.17 no.2E
    • /
    • pp.38-46
    • /
    • 1998
  • In this paper, we propose a remote speech input device, a new method of user-friendly speech input for spontaneous speech recognition systems. We define user friendliness in terms of hands-free operation and microphone independence in speech recognition applications. Our method adopts two algorithms: automatic speech detection and microphone array delay-and-sum beamforming (DSBF)-based speech enhancement. The automatic speech detection algorithm is composed of two stages: detection of a candidate speech portion, and classification of the detected candidate as speech or nonspeech using pitch information. The DSBF algorithm adopts the time-domain cross-correlation method for time delay estimation. In the performance evaluation, the speech detection algorithm shows a within-200 ms start point accuracy of 93%, 99% under 15 dB, 20 dB, and 25 dB signal-to-noise ratio (SNR) environments, respectively, and the end point accuracies are 72%, 89%, and 93% for the corresponding environments. The classification of speech and nonspeech for the region of the input signal where a start point was detected is performed by the pitch information-based method. The percentages of correct classification for speech and nonspeech input are 99% and 90%, respectively. The eight-microphone array-based speech enhancement using the DSBF algorithm shows a maximum SNR gain of 6 dB over a single microphone and an error reduction of more than 15% in the spontaneous speech recognition domain.
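
Delay-and-sum beamforming with time-domain cross-correlation delay estimation, as described in this abstract, can be sketched as follows. This is a simplified illustration with integer-sample delays and a brute-force lag search, not the authors' implementation; the function names and the `max_lag` parameter are assumptions.

```python
def estimate_delay(reference, signal, max_lag):
    """Return the lag (in samples) maximizing the cross-correlation.

    A positive lag means `signal` is a delayed copy of `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, ref_sample in enumerate(reference):
            m = n + lag
            if 0 <= m < len(signal):
                score += ref_sample * signal[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def delay_and_sum(channels, max_lag):
    """Align every channel to the first channel and average them."""
    reference = channels[0]
    # The reference channel has zero delay relative to itself.
    lags = [0] + [estimate_delay(reference, ch, max_lag) for ch in channels[1:]]
    output = []
    for n in range(len(reference)):
        total, count = 0.0, 0
        for ch, lag in zip(channels, lags):
            m = n + lag
            if 0 <= m < len(ch):  # skip samples shifted past the channel edge
                total += ch[m]
                count += 1
        output.append(total / count)
    return output
```

Averaging the aligned channels reinforces the speech, which adds coherently, while uncorrelated noise partially cancels; this is the mechanism behind the SNR gain over a single microphone.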


Noisy Speech Enhancement Based on Complex Laplacian Probability Density Function (복소 라플라시안 확률 밀도 함수에 기반한 음성 향상 기법)

  • Park, Yun-Sik;Jo, Q-Haing;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.6
    • /
    • pp.111-117
    • /
    • 2007
  • This paper presents a novel approach to speech enhancement based on a complex Laplacian probability density function (pdf). Using a goodness-of-fit (GOF) test, we show that the complex Laplacian pdf is a more suitable model than the conventional Gaussian pdf. The likelihood ratio (LR) is applied to derive the speech absence probability in the speech enhancement algorithm. The proposed algorithm is evaluated with an objective test and yields better results than the conventional Gaussian pdf-based scheme.
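
How a likelihood ratio leads to a speech absence probability follows from Bayes' rule. As a sketch under assumed notation (none of these symbols appear in the abstract): let $H_0$ and $H_1$ denote speech absence and presence, $Y_k$ the noisy spectral component in frequency bin $k$, and $\Lambda_k$ the likelihood ratio.

```latex
\Lambda_k = \frac{p(Y_k \mid H_1)}{p(Y_k \mid H_0)},
\qquad
P(H_0 \mid Y_k)
  = \frac{p(Y_k \mid H_0)\,P(H_0)}{p(Y_k \mid H_0)\,P(H_0) + p(Y_k \mid H_1)\,P(H_1)}
  = \frac{1}{1 + \dfrac{P(H_1)}{P(H_0)}\,\Lambda_k}
```

In a Laplacian-based scheme the densities inside $\Lambda_k$ would be complex Laplacian rather than Gaussian; the exact expressions used by the authors are not given in the abstract.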

A single-channel speech enhancement method based on restoration of both spectral amplitudes and phases for push-to-talk communication (Push-to-talk 통신을 위한 진폭 및 위상 복원 기반의 단일 채널 음성 향상 방식)

  • Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.36 no.1
    • /
    • pp.64-69
    • /
    • 2017
  • In this paper, we propose a single-channel speech enhancement method based on restoration of both spectral amplitudes and phases for PTT (Push-To-Talk) communication. The proposed method combines spectral amplitude and phase enhancement to provide high-quality speech, unlike other single-channel speech enhancement methods, which use only spectral amplitudes. We carried out side-by-side comparison experiments in various non-stationary noise environments to evaluate the performance of the proposed method. The experimental results show that the proposed method provides higher-quality speech than other methods under different noise conditions.

Implementation of Chip and Algorithm of a Speech Enhancement for an Automatic Speech Recognition Applied to Telematics Device (텔레메틱스 단말용 음성 인식을 위한 음성향상 알고리듬 및 칩 구현)

  • Kim, Hyoung-Gook
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.7 no.5
    • /
    • pp.90-96
    • /
    • 2008
  • This paper presents a single-chip acoustic speech enhancement algorithm for telematics devices. The algorithm consists of two stages: noise reduction and echo cancellation. An adaptive filter based on cross-spectral estimation is used to cancel echo. External background noise is removed and the clean speech is estimated using MMSE log-spectral magnitude estimation. To suit consumer electronics, we also design a low-cost, high-speed, and flexible hardware architecture. The performance of the proposed speech enhancement algorithm was measured by both the signal-to-noise ratio (SNR) and the recognition accuracy of an automatic speech recognition (ASR) system, and it yields better results than the conventional methods.


A Probabilistic Combination Method of Minimum Statistics and Soft Decision for Robust Noise Power Estimation in Speech Enhancement (강인한 음성향상을 위한 Minimum Statistics와 Soft Decision의 확률적 결합의 새로운 잡음전력 추정기법)

  • Park, Yun-Sik;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.4
    • /
    • pp.153-158
    • /
    • 2007
  • This paper presents a new approach to noise estimation that improves speech enhancement in non-stationary noisy environments. The proposed method combines two separate noise power estimates, one provided by minimum statistics (MS) for speech presence and one by soft decision (SD) for speech absence, according to the SAP (Speech Absence Probability) in each frequency bin. The proposed algorithm is evaluated with a subjective test under various noise environments and yields better results than the conventional MS-based or SD-based schemes.
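
A probabilistic combination of two noise power estimates can be sketched as a per-bin weighted average driven by the speech absence probability. The linear weighting below is an assumption made for illustration; the abstract does not state the exact combination rule, and the function and parameter names are hypothetical.

```python
def combine_noise_estimates(ms_estimate, sd_estimate, speech_absence_prob):
    """Blend two per-bin noise power estimates using the SAP.

    When speech is likely absent (SAP near 1) the soft-decision estimate
    dominates; when speech is likely present (SAP near 0) the
    minimum-statistics estimate dominates.
    """
    combined = []
    for ms, sd, p_absent in zip(ms_estimate, sd_estimate, speech_absence_prob):
        p = min(max(p_absent, 0.0), 1.0)  # clamp to a valid probability
        combined.append(p * sd + (1.0 - p) * ms)
    return combined
```

For example, with an SAP of 0 in one bin and 1 in another, the output takes the MS value in the first bin and the SD value in the second.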

Speech Enhancement in Noisy Speech Using Neural Network (신경회로망을 사용한 잡음이 중첩된 음성 강조)

  • Choi, Jae-Seung
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.5
    • /
    • pp.165-172
    • /
    • 2005
  • In speech recognition under noisy environments, it is necessary to construct a system that reduces the noise and enhances the speech. For speech enhancement, it is effective to imitate the human auditory system, which has an excellent spectral analysis mechanism. Accordingly, this paper proposes an adaptive method using the auditory mechanism called lateral inhibition. The method first estimates the noise intensity with a neural network, then adaptively adjusts both the lateral inhibition coefficients and the amplitude adjustment coefficient according to the noise intensity of each input frame. Spectral distortion measurements confirm that the proposed method is effective for speech degraded by white noise, colored noise, and road noise.