Convolutional neural network based traffic sound classification robust to environmental noise

Korean title: Convolutional neural network based traffic noise classification model robust to environmental noise

  • Received : 2018.09.20
  • Accepted : 2018.11.22
  • Published : 2018.11.30

Abstract

As urban populations grow, research on urban environmental noise is attracting more attention. In this study, we classify abnormal sounds occurring in traffic situations using a deep learning algorithm, an approach that has shown high performance in recent environmental sound classification studies. Specifically, we use a convolutional neural network to classify four classes: tire skidding sounds, car crash sounds, car horn sounds, and normal sounds. In addition, we add three types of environmental noise (rain, wind, and crowd noise) to the training data so that the classification model is more robust in real traffic situations where such noise is present. Experimental results show that the proposed traffic sound classification model achieves better performance than existing algorithms, particularly under harsh conditions with environmental noise.

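As the floating population of cities increases, research on urban environmental noise is becoming more important. In this study, abnormal sounds occurring in traffic situations are classified using a deep learning algorithm that has shown high performance in recent environmental sound classification studies. Specifically, a convolutional neural network is used to classify four classes: tire skidding sounds, car crash sounds, car horn sounds, and normal sounds. In addition, to obtain classification performance that is robust to environmental noise in real traffic situations, three environmental noises (rain, wind, and crowd sounds) were defined and used to design the classification model, which achieves a classification accuracy of over 88 % at a 3 dB SNR (signal-to-noise ratio). We propose a traffic sound classification model that shows higher classification performance than previous studies on the traffic sounds considered and that is robust to the three environmental noises of rain, wind, and crowd sounds.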
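The noise augmentation described in the abstract amounts to adding an environmental noise recording to a traffic sound clip at a chosen signal-to-noise ratio (3 dB and 10 dB are the conditions mentioned on this page). The following is a minimal sketch of that mixing step, not the authors' implementation: it assumes SNR is defined on mean signal power, that both clips are equal-length lists of float samples, and the helper names `mean_power`, `mix_at_snr`, and `measured_snr_db` are hypothetical. A real pipeline would more likely operate on arrays loaded with an audio library.

```python
import math
import random


def mean_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(event, noise, snr_db):
    """Scale `noise` so that event power / noise power matches `snr_db`, then add it.

    Returns (mixed, scaled_noise). Assumes power-based SNR (an assumption,
    the source does not state the exact definition used).
    """
    if len(event) != len(noise):
        raise ValueError("event and noise must have the same length")
    p_event = mean_power(event)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise clip is silent")
    # SNR_dB = 10 * log10(P_event / (g^2 * P_noise)); solve for the gain g.
    gain = math.sqrt(p_event / (p_noise * 10.0 ** (snr_db / 10.0)))
    scaled = [gain * n for n in noise]
    mixed = [e + n for e, n in zip(event, scaled)]
    return mixed, scaled


def measured_snr_db(signal, noise):
    """SNR in dB between a clean signal and the noise that was added to it."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


# Synthetic stand-ins for a traffic event and an environmental noise clip.
rng = random.Random(0)
event = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

mixed, scaled = mix_at_snr(event, noise, snr_db=3.0)
print(round(measured_snr_db(event, scaled), 2))  # close to the 3 dB target
```

Scaling the noise rather than the event keeps the level of the traffic sound unchanged across SNR conditions, which is one common design choice; scaling the event instead would be equally valid under the same SNR definition.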

Fig. 1. Network structure of the proposed model.

Fig. 2. Confusion matrix for four-class classification by the proposed model trained on clean data and tested on clean data (Table 3, clean train / clean test). The overall classification accuracy is 94.4 %.

Fig. 3. Confusion matrix for four-class classification by the proposed model trained on noisy data and tested on noisy data at an SNR of 10 dB [Table 3, noise train / noise test (10 dB)]. The overall classification accuracy is 90.7 %.

Table 1. Data distribution of the four classes. TS, CC, CH, and NS denote tire skidding, car crash, car horn, and normal sounds, respectively.

Table 2. Results of the two-class classification for the proposed model and the baseline (the classification accuracy for normal sounds is not reported in [2]).

Table 3. Classification results according to training and test data composition. TOT means overall classification accuracy.

References

  1. R. Banerjee, A. Sinha, and A. Saha, "Participatory sensing based traffic condition monitoring using horn detection," Proc. the 28th Annual ACM Symposium on Applied Computing, 567-569 (2013).
  2. P. Foggia, N. Petkov, A. Saggese, N. Strisciuglio, and M. Vento, "Audio surveillance of roads: A system for detecting anomalous sounds," IEEE Trans. Intelligent Transportation Systems, 17, 279-288 (2016). https://doi.org/10.1109/TITS.2015.2470216
  3. M. Cristani, M. Bicego, and V. Murino, "Audio-visual event recognition in surveillance video sequences," IEEE Trans. Multimedia, 9, 257-267 (2007). https://doi.org/10.1109/TMM.2006.886263
  4. K. J. Piczak, "Environmental sound classification with convolutional neural networks," IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 1-6 (2015).
  5. J. Salamon and J. P. Bello, "Deep convolutional neural networks and data augmentation for environmental sound classification," IEEE Signal Processing Letters, 24, 279-283 (2017). https://doi.org/10.1109/LSP.2017.2657381
  6. J. Salamon, C. Jacoby, and J. P. Bello, "A dataset and taxonomy for urban sound research," Proc. the 22nd ACM International Conference on Multimedia, 1041-1044 (2014).
  7. http://www.freesound.org
  8. B. McFee, C. Raffel, D. Liang, D. P. Ellis, M. McVicar, E. Battenberg, and O. Nieto, "librosa: Audio and music signal analysis in python," Proc. the 14th Python in Science Conference, 18-25 (2015).
  9. M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng, "TensorFlow: A system for large-scale machine learning," Proc. the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 265-283 (2016).