Mixed Noise Cancellation by Independent Vector Analysis and Frequency Band Beamforming Algorithm in 4-channel Environments

4채널 환경에서 독립벡터분석 및 주파수대역 빔형성 알고리즘에 의한 혼합잡음제거

  • Choi, Jae-Seung (Division of Smart Electrical and Electronic Engineering, Silla University)
  • 최재승 (신라대학교 스마트전기전자공학부)
  • Received : 2019.07.25
  • Accepted : 2019.10.15
  • Published : 2019.10.31

Abstract

This paper first proposes a technique that separates clean speech from mixed noise in noisy 4-channel source signals by applying an independent vector analysis (IVA) algorithm in the frequency domain. It then obtains an improved output speech signal by using the cross-correlation between the output of a frequency-domain delay-sum beamformer and the signals separated by the proposed IVA algorithm. In experiments on input speech mixed with noise that includes white noise, at input SNRs of 0 dB and -5 dB, the proposed algorithm improved the SNR by up to 10.90 dB and the segmental SNR by up to 10.02 dB compared with using the frequency-domain delay-sum beamforming algorithm alone. These results indicate that the proposed algorithm yields better speech quality than the frequency-domain delay-sum beamforming algorithm.
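The results above are reported as improvements in SNR and segmental SNR. The abstract does not give the paper's exact definitions, so the following is only a minimal sketch of one common way to compute both measures from a clean reference and a processed output. The function names, the 256-sample frame length, and the clamping range of -10 dB to 35 dB are assumptions made for illustration, not values from the paper.

```python
import math


def snr_db(clean, processed):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    signal_energy = sum(s * s for s in clean)
    error_energy = sum((s - y) ** 2 for s, y in zip(clean, processed))
    if error_energy == 0.0:
        return math.inf  # processed output is identical to the reference
    return 10.0 * math.log10(signal_energy / error_energy)


def segmental_snr_db(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi].

    frame_len, lo, and hi are assumed values, not taken from the paper.
    """
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        y = processed[start:start + frame_len]
        signal_energy = sum(v * v for v in s)
        error_energy = sum((a - b) ** 2 for a, b in zip(s, y))
        if signal_energy == 0.0:
            continue  # skip silent reference frames
        if error_energy == 0.0:
            value = hi  # perfect frame: use the upper clamp
        else:
            value = 10.0 * math.log10(signal_energy / error_energy)
        frame_snrs.append(min(max(value, lo), hi))
    if not frame_snrs:
        raise ValueError("no usable frames: signal too short or silent")
    return sum(frame_snrs) / len(frame_snrs)


# Hypothetical usage: compare two processed outputs against the same reference.
reference = [math.sin(2.0 * math.pi * 5.0 * n / 1024.0) for n in range(1024)]
baseline_out = [v + 0.2 * math.cos(0.37 * n) for n, v in enumerate(reference)]
enhanced_out = [v + 0.05 * math.cos(0.37 * n) for n, v in enumerate(reference)]

snr_gain = snr_db(reference, enhanced_out) - snr_db(reference, baseline_out)
seg_gain = (segmental_snr_db(reference, enhanced_out)
            - segmental_snr_db(reference, baseline_out))
```

Here `snr_gain` and `seg_gain` play the role of the improvement figures quoted in the abstract: the difference, in dB, between the score of one method's output and the score of a baseline's output on the same reference signal.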
