Evaluation of a signal segregation by FDBM

Evaluation of the sound source segregation performance of FDBM

  • Chai-bong Lee (Division of Information System Engineering, Dongseo University)
  • Received : 2013.10.28
  • Accepted : 2013.12.16
  • Published : 2013.12.31

Abstract

Various approaches to sound source segregation have been proposed. Among them, the frequency domain binaural model (FDBM) has the advantages of low computational load and effective howling cancellation. A binaural hearing assistance system based on FDBM has been proposed, which can enhance a desired signal based on directivity information. Although FDBM has been evaluated in terms of signal-to-noise ratio (SNR) and the coherence function, those results do not always agree with human impressions: both are physical measures and do not take human perception into account. Because a binaural hearing assistance system is one of the major applications, the quality of the segregated sound must be kept at a sufficient level. In this paper, the signal segregation performance of FDBM is evaluated by three objective methods, namely SNR, coherence, and Perceptual Evaluation of Speech Quality (PESQ), to discuss the characteristics of FDBM in sound source segregation. The simulation results show that FDBM improves the quality of the left and right channel signals to an equivalent level. The results also suggest that PESQ may provide a more useful measure than SNR and coherence of the segregation performance of FDBM. The PESQ results show the effects of the segregation parameters and indicate appropriate parameter values under the tested conditions.

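Various methods for sound source segregation have been proposed; among them, the frequency domain binaural model (FDBM) has a low computational load and is effective at howling cancellation. Binaural hearing assistance systems based on FDBM have been evaluated by SNR and the coherence function, which do not take human auditory characteristics into account. To address the problem of sound quality, this paper evaluates the sound source segregation performance of FDBM. Its basic characteristics were examined by simulation using three methods: SNR, the coherence function, and PESQ. In all results, FDBM reduced the difference between the evaluation values of the left and right channels, and both channels were improved to nearly the same level. A consistent improvement was also observed when the source direction was changed and when the number of sources was increased. Comparing the SNR, coherence function, and PESQ results, the PESQ evaluation showed an improvement from segregation under almost all conditions, even when the input SNR was varied.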
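
As a rough illustration of the SNR measure named in the abstract, the sketch below computes a per-channel SNR against a known target signal and the gap between the left and right channels. It is a minimal sketch under stated assumptions, not the paper's evaluation procedure: the function name `snr_db`, the toy signals, and the choice to treat everything other than the target as noise are all hypothetical.

```python
import math

def snr_db(reference, observed):
    """SNR in dB, treating (observed - reference) as the noise component."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((o - r) ** 2 for r, o in zip(reference, observed))
    if noise_power == 0:
        return math.inf  # observed equals the reference exactly
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical toy data: one target and one interfering signal,
# mixed at different levels in the left and right channels.
target = [1.0, -1.0, 1.0, -1.0]
interference = [1.0, 1.0, -1.0, -1.0]
left_in = [t + 0.5 * n for t, n in zip(target, interference)]
right_in = [t + 0.1 * n for t, n in zip(target, interference)]

snr_left = snr_db(target, left_in)
snr_right = snr_db(target, right_in)

# Difference between the two channels' SNR values, in dB.
channel_gap_db = abs(snr_left - snr_right)
```

Computing the same quantity on the channels before and after segregation would show whether the left and right SNR values have moved closer together. A physical measure of this kind says nothing about perceived quality, which is why the abstract also considers PESQ.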
