Object Size Prediction Based on Statistics-Adaptive Linear Regression for Object Detection

  • Received: 2020.12.17
  • Reviewed: 2021.03.12
  • Published: 2021.03.30

Abstract

This paper proposes an object size prediction method for object detection based on statistics-adaptive linear regression. Among existing deep learning-based object detection algorithms, YOLOv2 and YOLOv3 use a statistics-adaptive exponential regression model in the last layer of the network to predict the size of objects. However, because of the nature of the exponential function, an exponential regression model can propagate very large derivatives of the loss function to the network parameters during backpropagation. To ease this exploding gradient problem, we propose a statistics-adaptive linear regression model for object size prediction. The proposed model is used in the last layer of a deep learning network and predicts the size of objects using statistics of object sizes estimated from the training dataset. To evaluate the method, we redesigned a network based on YOLOv3 tiny with the proposed model and compared its detection performance with that of the original YOLOv3 tiny on the UFPR-ALPR dataset. In these experiments, the redesigned network achieved higher detection performance than YOLOv3 tiny.
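The gradient argument in the abstract can be illustrated with a small sketch. The exponential decoding below, `size = prior * exp(t)`, is the form used by YOLOv2 and YOLOv3, where the prior is derived from training-set box sizes. The linear decoding `size = mu + sigma * t` is only an assumed, illustrative form: the abstract does not give the paper's exact parameterization, and the function names, the choice of mean and standard deviation as the statistics, and the sample widths are all hypothetical.

```python
import math
from statistics import mean, pstdev


def size_statistics(sizes):
    """Mean and population standard deviation of training-set object sizes."""
    return mean(sizes), pstdev(sizes)


def exp_decode(t, prior):
    """YOLOv2/YOLOv3-style decoding: size = prior * exp(t)."""
    return prior * math.exp(t)


def exp_decode_grad(t, prior):
    """d(size)/dt for the exponential model; it equals the decoded size."""
    return prior * math.exp(t)


def linear_decode(t, mu, sigma):
    """Assumed linear decoding (illustrative only): size = mu + sigma * t."""
    return mu + sigma * t


def linear_decode_grad(t, mu, sigma):
    """d(size)/dt for the assumed linear model; constant in t."""
    return sigma


if __name__ == "__main__":
    widths = [40.0, 60.0, 80.0]  # made-up training-set box widths in pixels
    mu, sigma = size_statistics(widths)
    for t in (0.0, 2.0, 5.0, 10.0):
        # The exponential gradient grows with t; the linear one stays at sigma.
        print(t, exp_decode_grad(t, prior=mu), linear_decode_grad(t, mu, sigma))
```

Under these assumptions, the derivative of the exponential decoding with respect to the raw network output grows without bound as the output grows, while the derivative of the linear decoding stays fixed at the scale statistic. That is the property the abstract relies on.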
