• Title/Summary/Keyword: speech presence probability (음성 존재 확률)

An Optimally-Modified Multichannel Wiener Filter Using Speech Presence Probability (음성존재확률을 이용한 최적 변형 다채널 위너 필터)

  • Jeong, Sangbae;Kim, Youngil
    • Smart Media Journal / v.7 no.3 / pp.9-15 / 2018
  • This paper proposes an optimal gain modification method for the multichannel Wiener filter (MWF) using speech presence probabilities. Conventional gain modification methods for MWFs are relatively heuristic and therefore increase speech distortion while reducing residual noise. In contrast, the proposed method, derived by solving the unconstrained minimization of a cost function that incorporates the speech presence probability, reduces residual noise and signal distortion simultaneously. An evaluation of the filtered waveforms and spectrograms verifies that the proposed method yields an improved SNR with less signal distortion than the conventional MWF.

Improved speech enhancement of multi-channel Wiener filter using adjustment of principal subspace vector (다채널 위너 필터의 주성분 부공간 벡터 보정을 통한 잡음 제거 성능 개선)

  • Kim, Gibak
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.490-496 / 2020
  • We present a method to improve the performance of the multi-channel Wiener filter in noisy environments. When building a subspace-based multi-channel Wiener filter for a single target source, the target speech component can be effectively estimated in the principal subspace of the speech correlation matrix. The speech correlation matrix can be estimated by subtracting the noise correlation matrix from the signal correlation matrix, on the assumption that the cross-correlation between speech and interfering noise is negligible compared with the speech correlation. However, this assumption does not hold in the presence of strong interfering noise, and a significant error can then be induced in the principal subspace. In this paper, we propose adjusting the principal subspace vector using the speech presence probability and the steering vector for the desired speech source. The multi-channel speech presence probability is derived in the principal subspace and applied to adjust the principal subspace vector. Simulation results show that the proposed method improves the performance of the multi-channel Wiener filter in noisy environments.

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

  • Shin, Min-Hwa;Park, Ji-Hun;Kim, Hong-Kook
    • Proceedings of the Korean Society of Broadcast Engineers Conference / pp.150-151 / 2010
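  • This paper proposes a statistical model-based voice activity detection method for dual-channel speech recognition in noisy environments. The proposed method uses spatial cues obtained from the multichannel input signal to build probability models of speech presence and speech absence, and performs voice activity detection with these models. The spatial cues are the inter-channel time difference and the inter-channel level difference between the two channels, and the speech presence and absence probabilities are represented by probability models based on Gaussian kernel densities. Speech intervals are detected by estimating, for each time frame, the ratio of the speech absence probability to the speech presence probability. To evaluate the proposed method, speech recognition performance is measured using only the detected intervals as input. In the experiments, the proposed spatial-cue-based statistical model method achieved relative reductions in recognition error rate of 15.6% and 15.4% compared with a statistical model-based voice activity detection method using frequency energy and a voice activity detection method based on frequency spectral density, respectively.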
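
The detection rule described in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `gaussian_kde` and `detect_speech`, the isotropic kernel, the bandwidth, the threshold, and all training cues are assumptions made up for the example. Each frame is a pair (inter-channel time difference, inter-channel level difference), and a frame is labeled speech when the ratio of the absence density to the presence density falls below a threshold.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Build a 2-D density from (time_diff, level_diff) samples
    using an isotropic Gaussian kernel (a simplifying assumption)."""
    norm = 1.0 / (len(samples) * 2.0 * math.pi * bandwidth ** 2)

    def density(cue):
        total = 0.0
        for s in samples:
            d2 = (cue[0] - s[0]) ** 2 + (cue[1] - s[1]) ** 2
            total += math.exp(-0.5 * d2 / bandwidth ** 2)
        return norm * total

    return density

def detect_speech(frames, p_presence, p_absence, threshold=1.0, eps=1e-12):
    """Label a frame as speech when the absence/presence ratio is below threshold."""
    decisions = []
    for cue in frames:
        ratio = p_absence(cue) / (p_presence(cue) + eps)
        decisions.append(ratio < threshold)
    return decisions

# Hypothetical training cues: speech from a frontal talker clusters near (0, 0),
# while noise-only frames are scattered far from it. All numbers are made up.
presence_cues = [(0.0, 0.0), (0.1, 0.2), (-0.1, -0.1), (0.05, 0.1)]
absence_cues = [(0.8, -3.0), (-0.9, 2.5), (0.7, 3.0), (-0.6, -2.8)]

p_presence = gaussian_kde(presence_cues, bandwidth=0.5)
p_absence = gaussian_kde(absence_cues, bandwidth=0.5)

decisions = detect_speech([(0.02, 0.05), (0.75, -2.9)], p_presence, p_absence)
print(decisions)  # [True, False]
```

In practice the two cues have different units, so a per-dimension bandwidth would likely be more appropriate than the single shared value assumed here.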

Intelligibility Enhancement of Multimedia Contents Using Spectral Shaping (스펙트럼 성형기법을 이용한 멀티미디어 콘텐츠의 명료도 향상)

  • Ji, Youna;Park, Young-cheol;Hwang, Young-su
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.11 / pp.82-88 / 2016
  • In this paper, we propose an intelligibility enhancement algorithm for multimedia content using spectral shaping. Dialogue signals are essential for understanding the plot of audio-visual media content such as movies and TV programs. However, non-dialogue components such as sound effects and background music often degrade dialogue clarity. To overcome this problem, this paper aims to improve the dialogue clarity of audio soundtracks, which contain important cues for the visual scenes. In the proposed method, the dialogue components are first detected by a soft masker based on the speech presence probability (SPP), which is widely used in the speech enhancement field. The extracted dialogue signals are then passed to the spectral shaping method, which reallocates the spectro-temporal energy of speech to enhance intelligibility. The total energy is kept unchanged by a loudness normalization process to prevent saturation. The algorithm was evaluated using modeled and real movie soundtracks, and the results show that it enhances dialogue clarity while preserving the total audio power.
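
The processing chain in this abstract (SPP-based soft masking, spectral shaping, then normalization) can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the per-bin shaping gains, and the use of a plain energy-preserving rescale in place of a true loudness normalization are all assumptions for illustration.

```python
import math

def soft_mask(magnitudes, spp):
    """Weight each spectral magnitude by its speech presence probability."""
    return [p * m for p, m in zip(spp, magnitudes)]

def shape_and_normalize(dialogue, shaping_gains):
    """Apply per-bin shaping gains, then rescale so total energy is unchanged.
    The energy rescale stands in for loudness normalization (an assumption)."""
    shaped = [g * d for g, d in zip(shaping_gains, dialogue)]
    energy_in = sum(d * d for d in dialogue)
    energy_out = sum(s * s for s in shaped)
    if energy_out == 0.0:
        return shaped
    scale = math.sqrt(energy_in / energy_out)
    return [scale * s for s in shaped]

# Hypothetical single-frame example with three frequency bins.
magnitudes = [2.0, 2.0, 2.0]
spp = [1.0, 0.5, 0.0]           # made-up speech presence probabilities
dialogue = soft_mask(magnitudes, spp)

gains = [1.0, 2.0, 1.0]         # made-up shaping gains boosting the middle bin
enhanced = shape_and_normalize(dialogue, gains)
```

After the rescale, the middle bin carries a larger share of the frame's energy than before, while the summed squared magnitude of `enhanced` matches that of `dialogue`.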

Low-Complexity Speech Enhancement Algorithm Based on IMCRA Algorithm for Hearing Aids (보청기를 위한 IMCRA 기반 저연산 음성 향상 알고리즘)

  • Jeon, Yuyong;Lee, Sangmin
    • Journal of rehabilitation welfare engineering & assistive technology / v.11 no.4 / pp.363-370 / 2017
  • In this paper, we propose a low-complexity speech enhancement algorithm based on improved minima controlled recursive averaging (IMCRA) and the log minimum mean square error (logMMSE) estimator. The IMCRA algorithm tracks the minimum of the input power within buffers in a local window and identifies speech presence using the ratio between the input power and its minimum. This process requires a large number of operations. To reduce the number of operations of the IMCRA algorithm, the minimum is instead tracked using time-varying, frequency-dependent smoothing based on the speech presence probability. The proposed algorithm enhanced speech quality by 2.778%, 3.481%, 2.980%, and 2.162% at SNRs of 0, 5, 10, and 15 dB, respectively, and reduced computational complexity by 9.570% on average.
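
As a rough illustration of time-varying, frequency-dependent smoothing driven by the speech presence probability, the sketch below uses the recursive update familiar from MCRA-style noise estimators. It is not the paper's minimum tracker: the function name `update_noise`, the base smoothing constant `alpha`, and the example values are assumptions. When the probability is near 1 the estimate is frozen, and when it is near 0 the estimate follows the input power.

```python
def update_noise(noise_psd, frame_psd, spp, alpha=0.85):
    """One recursive-averaging step per frequency bin.
    The effective smoothing factor grows toward 1 as speech becomes more likely."""
    updated = []
    for n, y, p in zip(noise_psd, frame_psd, spp):
        a = alpha + (1.0 - alpha) * p          # time-varying, per-bin smoothing
        updated.append(a * n + (1.0 - a) * y)  # blend old estimate and new power
    return updated

# Hypothetical two-bin example: speech is certain in bin 0 and absent in bin 1.
noise = [1.0, 1.0]
frame = [4.0, 4.0]
spp = [1.0, 0.0]
new_noise = update_noise(noise, frame, spp)
```

Here bin 0 keeps its previous estimate, while bin 1 moves toward the frame power by a fraction `1 - alpha`.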