Statistical Voice Activity Detector Based on Signal Subspace Model

Ryu, Kwang-Chun; Kim, Dong-Kook

The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume 27 Issue 7 / Pages 372-378 / 2008 / 1225-4428 (pISSN) / 2287-3775 (eISSN)

The Acoustical Society of Korea (한국음향학회)

신호 준공간 모델에 기반한 통계적 음성 검출기

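Ryu, Kwang-Chun (Department of Electronics and Computer Engineering, Chonnam National University);
Kim, Dong-Kook (Department of Electronics and Computer Engineering, Chonnam National University)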

Published : 2008.10.31

Abstract

Voice activity detectors (VAD) are important in wireless communication and speech signal processing. In conventional VAD methods, an expression for the likelihood ratio test (LRT) based on statistical models is derived in the discrete Fourier transform (DFT) domain. Speech or noise is then decided by comparing the value of this expression with a threshold. This paper presents a new statistical VAD method based on a signal subspace approach. Probabilistic principal component analysis (PPCA) is employed to obtain a signal subspace model that incorporates a probabilistic model of the noisy signal into the signal subspace method. The proposed approach provides a novel decision rule based on the LRT in the signal subspace domain. Experimental results show that the proposed signal subspace model based VAD method outperforms those based on the widely used Gaussian distribution in the DFT domain.

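A voice activity detector (VAD) is a very important technique in mobile communication and speech signal processing. Conventional voice detection methods perform a likelihood ratio test (LRT) based on statistical models in the discrete Fourier transform (DFT) domain. The resulting value is then compared with a threshold to decide whether or not the frame is speech. This paper proposes a new statistical voice detection technique based on the signal subspace. Probabilistic principal component analysis (PPCA) is used to obtain a probabilistic model of the noisy signal within the signal subspace method. The proposed technique applies a decision rule based on the LRT in the signal subspace domain. Voice detection experiments show that the VAD based on the signal subspace model gives better detection results than the Gaussian VAD based on the frequency domain.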
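The conventional DFT-domain decision described in the abstract can be illustrated with a minimal sketch. It assumes the widely used complex Gaussian model, in which each DFT coefficient is zero-mean with variance λ_n under the noise-only hypothesis and λ_n + λ_s under the speech-plus-noise hypothesis, so that the per-bin log-likelihood ratio is γξ/(1+ξ) − ln(1+ξ), where ξ = λ_s/λ_n and γ = |X|²/λ_n. The function names, the assumption that the per-bin variances are already known, and the default threshold of 0 are illustrative choices and are not taken from the paper. The signal subspace decision rule itself is not reproduced here, because the abstract does not state its form.

```python
import math


def gaussian_log_lrt(power_spectrum, noise_var, speech_var):
    """Mean per-bin log-likelihood ratio for one frame.

    Assumption (not from the paper): each DFT coefficient is zero-mean
    complex Gaussian with variance noise_var[k] under H0 (noise only)
    and noise_var[k] + speech_var[k] under H1 (speech plus noise).
    """
    if not (len(power_spectrum) == len(noise_var) == len(speech_var)):
        raise ValueError("all inputs must have the same number of bins")
    if len(power_spectrum) == 0:
        raise ValueError("at least one frequency bin is required")

    total = 0.0
    for power, lam_n, lam_s in zip(power_spectrum, noise_var, speech_var):
        xi = lam_s / lam_n      # a priori SNR
        gamma = power / lam_n   # a posteriori SNR
        # log of p(X | H1) / p(X | H0) for a complex Gaussian bin
        total += gamma * xi / (1.0 + xi) - math.log1p(xi)
    return total / len(power_spectrum)


def is_speech(power_spectrum, noise_var, speech_var, threshold=0.0):
    """Declare speech when the mean log-likelihood ratio exceeds the threshold.

    The default threshold is a hypothetical value; in practice it is tuned
    to trade off missed speech against false alarms.
    """
    return gaussian_log_lrt(power_spectrum, noise_var, speech_var) > threshold


if __name__ == "__main__":
    noise_var = [1.0, 1.0, 1.0, 1.0]
    speech_var = [9.0, 9.0, 9.0, 9.0]
    quiet_frame = [1.0, 1.0, 1.0, 1.0]     # power close to the noise level
    loud_frame = [10.0, 10.0, 10.0, 10.0]  # power close to speech plus noise
    print(is_speech(quiet_frame, noise_var, speech_var))
    print(is_speech(loud_frame, noise_var, speech_var))
```

Averaging the per-bin log ratios is equivalent to taking the geometric mean of the likelihood ratios across bins, which keeps the statistic on a comparable scale when the number of bins changes.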
