Assembling three one-camera images for three-camera intersection classification

  • Marcella Astrid (Department of Artificial Intelligence, University of Science and Technology)
  • Seung-Ik Lee (Department of Artificial Intelligence, University of Science and Technology)
  • Received: 2023.03.18
  • Accepted: 2023.08.09
  • Published: 2023.10.20

Abstract

Determining whether a self-driving agent is in the middle of an intersection can be extremely difficult when relying on visual input taken from a single camera. In such a problem setting, a wider range of views is essential, which drives us to use three cameras positioned at the front, left, and right of the agent for better intersection recognition. However, collecting adequate training data with three cameras poses several practical difficulties; hence, we propose using data collected from one camera to train a three-camera model, which would enable us to more easily compile a variety of training data to endow our model with improved generalizability. In this work, we provide three fusion methods (feature, early, and late) for combining the information from the three cameras. Extensive pedestrian-view intersection classification experiments show that our feature fusion model provides an area under the curve and F1-score of 82.00 and 46.48, respectively, which considerably outperforms contemporary three- and one-camera models.
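As a rough illustration of the feature-fusion design described above, the following PyTorch sketch encodes the front, left, and right views with a shared CNN and concatenates the pooled features before a linear intersection classifier. The class name, ResNet-18 backbone, and layer sizes are illustrative assumptions, not the authors' exact implementation; the point is only to show where the three views meet when fusion happens at the feature level.

import torch
import torch.nn as nn
from torchvision import models

class ThreeCameraFeatureFusion(nn.Module):
    """Illustrative feature-fusion classifier (hypothetical, not the paper's exact model):
    one shared CNN backbone encodes the front, left, and right views, and the three
    pooled feature vectors are concatenated before a small classification head."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        backbone = models.resnet18(weights=None)   # any ImageNet-style CNN would do here
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()                # keep only the pooled feature vector
        self.backbone = backbone                   # weights shared across the three cameras
        self.classifier = nn.Linear(3 * feat_dim, num_classes)

    def forward(self, front, left, right):
        # Encode each view with the same backbone, then fuse at the feature level.
        feats = [self.backbone(x) for x in (front, left, right)]
        return self.classifier(torch.cat(feats, dim=1))

# During training, the three inputs can be assembled from independently collected
# one-camera images rather than genuine synchronized triplets, which reflects the
# paper's idea of using one-camera data to train a three-camera model.
model = ThreeCameraFeatureFusion()
front = left = right = torch.randn(4, 3, 224, 224)
logits = model(front, left, right)   # shape: (4, 2)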

Keywords

Acknowledgements

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS2023-00215760, Guide Dog: Development of Navigation AI Technology of a Guidance Robot for the Visually Impaired Person, 70%) and (No. 2019-0-01309, Development of AI Technology for Guidance of a Mobile Robot to its Goal with Uncertain Maps in Indoor/Outdoor Environments, 30%).
