Noise Robust Speaker Verification Using Sub-Band Weighting

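  • Sungtak Kim (School of Engineering, Information and Communications University) ;
  • Mikyong Ji (School of Engineering, Information and Communications University) ;
  • Hoirin Kim (School of Engineering, Information and Communications University)
  • Published: 2009.04.30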

Abstract

Speaker verification determines whether the speaker of a test utterance is the claimed speaker, based on the score of that utterance. In recent years, methods based on Gaussian mixture models and a universal background model (GMM-UBM) have been the dominant approach for text-independent speaker verification. Systems built on these methods perform very well under noise-free laboratory conditions, but their performance degrades dramatically in noisy, real-world conditions. To overcome this degradation, the multi-band feature recombination method was proposed, but it has the drawback that the feature vectors of all sub-bands are used together to compute the likelihood score. To address this drawback, our previous work proposed a modified feature recombination method that computes the likelihood score of each sub-band independently. In this paper, we propose a sub-band weighting method that uses the signal-to-noise ratio of each sub-band as a measure of its reliability, and we combine it with the previously proposed modified feature recombination. In noisy environments, the proposed method reduces errors by 28% compared with the conventional feature recombination method.
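The abstract does not give the weighting formula, so the following is only a minimal sketch of how SNR-based sub-band weighting could be combined with independently computed sub-band scores. The function names, the dB-to-linear normalisation rule, and the decision threshold are all assumptions made for illustration, not the authors' method. The per-sub-band log-likelihoods are assumed to come from a speaker model and a UBM that are trained elsewhere.

```python
def snr_weights(snr_db):
    """Map per-sub-band SNRs in dB to weights that sum to 1.

    Assumption: weights are proportional to the linear SNR; the paper's
    actual weighting rule is not stated in the abstract.
    """
    linear = [10.0 ** (s / 10.0) for s in snr_db]
    total = sum(linear)
    return [v / total for v in linear]


def weighted_score(speaker_ll, ubm_ll, snr_db):
    """Weighted sum of per-sub-band log-likelihood ratios.

    speaker_ll[b] and ubm_ll[b] are the log-likelihoods of sub-band b
    under the claimed-speaker model and the UBM (hypothetical inputs).
    """
    weights = snr_weights(snr_db)
    return sum(w * (s - u) for w, s, u in zip(weights, speaker_ll, ubm_ll))


def verify(speaker_ll, ubm_ll, snr_db, threshold=0.0):
    """Accept the claimed speaker if the weighted score exceeds the threshold."""
    return weighted_score(speaker_ll, ubm_ll, snr_db) > threshold


# Illustrative values only: the third sub-band is noisy (low SNR) and
# contributes a misleading score, which the weights suppress.
speaker_ll = [-10.0, -12.0, -30.0]
ubm_ll = [-11.0, -13.0, -20.0]
snr_db = [20.0, 20.0, -10.0]
accepted = verify(speaker_ll, ubm_ll, snr_db)
```

With these made-up numbers, an unweighted average of the sub-band ratios is negative, whereas the weighted score stays positive because the low-SNR sub-band receives almost no weight.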

Keywords

References

  1. D. Reynolds, T. Quatieri, and R. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Processing, vol. 10, nos. 1-3, pp. 19-41, 2000 https://doi.org/10.1006/dspr.1999.0361
  2. A. Drygajlo and M. El-Maliki, "Speaker verification in noisy environments with combined spectral subtraction and missing feature theory," In Proc. ICASSP, vol. 2, pp. 121-124, 1998 https://doi.org/10.1109/ICASSP.1998.674382
  3. K. Yiu, M. Mak, and S. Kung, "Environment adaptation for robust speaker verification," In Proc. EUROSPEECH, pp. 2973-2976, 2003
  4. C. Barras and J. Gauvain, "Feature and score normalization for speaker verification of cellular data," In Proc. ICASSP, vol. 2, pp. 49-52, 2003 https://doi.org/10.1109/ICASSP.2003.1202291
  5. S. Kim, M. Ji, Y. Suh, and H. Kim, "Noise Robust Speaker Identification using Sub-Band Weighting in Multi Band Approach," IEICE Trans. Inf. & Syst., vol. E90-D, no. 12, pp. 2110-2114, 2007 https://doi.org/10.1093/ietisy/e90-d.12.2110
  6. S. Kim, M. Ji, and H. Kim, "Noise Robust Speaker Recognition using Sub-Band Likelihoods and Reliable Feature Selection," ETRI Journal, vol. 30, no. 1, pp. 89-100, 2008 https://doi.org/10.4218/etrij.08.0107.0108
  7. S. Kim, M. Ji, and H. Kim, "Noise robust speaker verification using reliable sub-band feature vector selection," Malsori, no. 63, pp. 125-137, 2007 (in Korean)
  8. TIMIT database, TIMIT acoustic-phonetic speech corpus, National Institute of Standards and Technology (NIST), NIST speech disk, 1990
  9. D. Pearce and H. Hirsch, "The aurora experimental framework for the performance evaluation of speech recognition systems under noise conditions," In Proc. ICSLP, vol. 4, pp. 29-32, 2000