Optimization of the Number of Filter in CNN Noise Attenuator

  • Lee, Haeng-Woo (Intelligent Information Communication Engineering, Namseoul University)
  • Received : 2021.07.14
  • Accepted : 2021.08.17
  • Published : 2021.08.31

Abstract

This paper studies how the number of filters in the CNN (Convolutional Neural Network) layer affects the performance of a noise attenuator. The system uses a neural-network prediction filter instead of an adaptive filter and attenuates noise by deep learning. Speech is estimated from a noisy speech signal using a 64-neuron, 16-kernel CNN filter and the error back-propagation algorithm. To verify the performance of the noise attenuator with respect to the number of filters, a program was written with the Keras library and simulations were performed. The simulation results show that the system has the smallest MSE (Mean Squared Error) and MAE (Mean Absolute Error) values with 16 filters and the worst performance with 4 filters. With 8 or more filters, the MSE and MAE values do not differ significantly with the number of filters. These results indicate that about 8 or more filters are needed to represent the main characteristics of the speech signal.
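
The two error measures named in the abstract, and the way the size of a convolution layer scales with its filter count, can be illustrated with a small sketch. This is plain Python, not the author's Keras program: the helper names (`mse`, `mae`, `conv1d_param_count`), the toy sine-plus-noise signal, the kernel size of 16, and the single input channel are all assumptions made for illustration, and the printed error values say nothing about the paper's results.

```python
import math
import random


def mse(clean, estimate):
    """Mean squared error between two equal-length sequences."""
    if len(clean) != len(estimate):
        raise ValueError("sequences must have the same length")
    return sum((c - e) ** 2 for c, e in zip(clean, estimate)) / len(clean)


def mae(clean, estimate):
    """Mean absolute error between two equal-length sequences."""
    if len(clean) != len(estimate):
        raise ValueError("sequences must have the same length")
    return sum(abs(c - e) for c, e in zip(clean, estimate)) / len(clean)


def conv1d_param_count(num_filters, kernel_size, in_channels=1):
    """Trainable parameters of one 1-D convolution layer with a bias per filter."""
    return num_filters * (kernel_size * in_channels + 1)


# Hypothetical toy data: a sine wave stands in for clean speech,
# and additive Gaussian noise stands in for the disturbance.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
noisy = [c + rng.gauss(0.0, 0.1) for c in clean]

print("MSE of the unprocessed noisy signal:", mse(clean, noisy))
print("MAE of the unprocessed noisy signal:", mae(clean, noisy))

# Layer size for the filter counts named in the abstract (4, 8, 16),
# assuming a kernel size of 16 and one input channel.
for num_filters in (4, 8, 16):
    print(num_filters, "filters:", conv1d_param_count(num_filters, kernel_size=16), "parameters")
```

Under these assumptions the parameter count grows linearly with the number of filters, so doubling the filters doubles the size of the layer. The abstract's finding that the error changes little beyond about 8 filters would then mean the extra parameters add little accuracy.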

Acknowledgement

This research was supported by the Namseoul University academic research fund in 2021.
