Object Detection on the Road Environment Using Attention Module-based Lightweight Mask R-CNN

  • Song, Minsoo (Department of Electrical and Electronics Engineering, Konkuk University)
  • Kim, Wonjun (Department of Electrical and Electronics Engineering, Konkuk University)
  • Jang, Rae-Young (Research Data Sharing Center, Korea Institute of Science and Technology Information)
  • Lee, Ryong (Research Data Sharing Center, Korea Institute of Science and Technology Information)
  • Park, Min-Woo (Research Data Sharing Center, Korea Institute of Science and Technology Information)
  • Lee, Sang-Hwan (Research Data Sharing Center, Korea Institute of Science and Technology Information)
  • Choi, Myung-seok (Research Data Sharing Center, Korea Institute of Science and Technology Information)
  • Received : 2020.06.22
  • Accepted : 2020.10.07
  • Published : 2020.11.30

Abstract

Object detection plays a crucial role in self-driving systems. With recent advances in image recognition based on deep convolutional neural networks, object detection using deep learning has been actively studied. In this paper, we propose a lightweight version of Mask R-CNN, which has been most widely used for object detection, to efficiently predict the location and shape of various objects in road environments. Furthermore, an attention module is applied to the network layers that play different roles within Mask R-CNN, so that feature maps are adaptively re-calibrated and detection performance improves. Experimental results on real driving scenes demonstrate that the proposed method maintains high detection performance with significantly fewer network parameters than existing methods.
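
The abstract does not say which attention module is used. As a rough illustration of what "adaptively re-calibrating feature maps" can mean, the sketch below assumes a squeeze-and-excitation-style channel attention: each channel is averaged to a single value, two small fully connected layers turn those values into per-channel gates between 0 and 1, and every channel is scaled by its gate. The function names, tensor sizes, reduction ratio, and random weights are all hypothetical and are not taken from the paper.

```python
import math
import random


def global_average_pool(feature_map):
    """Squeeze step: reduce each H x W channel to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def dense(vector, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]


def channel_attention(feature_map, w1, b1, w2, b2):
    """Re-calibrate channels: squeeze, excite (ReLU then sigmoid), rescale."""
    squeezed = global_average_pool(feature_map)
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    gates = [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]
    recalibrated = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return recalibrated, gates


# Hypothetical sizes: 4 channels, 2x2 spatial grid, reduction ratio 2.
random.seed(0)
channels, reduced = 4, 2
feature_map = [
    [[random.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
    for _ in range(channels)
]
# Random weights stand in for parameters that would be learned in training.
w1 = [[random.uniform(-1.0, 1.0) for _ in range(channels)] for _ in range(reduced)]
b1 = [0.0] * reduced
w2 = [[random.uniform(-1.0, 1.0) for _ in range(reduced)] for _ in range(channels)]
b2 = [0.0] * channels

recalibrated, gates = channel_attention(feature_map, w1, b1, w2, b2)
print("channel gates:", [round(g, 3) for g in gates])
```

In a design of this kind the gates depend on the input itself, so the same layer can emphasize different channels for different scenes. The extra parameters are only the two small weight matrices, which is one reason such modules are often paired with lightweight networks.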

Acknowledgement

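This research was carried out as commissioned research under the Korea Institute of Science and Technology Information (KISTI) project "Establishment of a Research Data Sharing and Dissemination System (K-20-L01-C04-S01)".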
