Data Augmentation for Tomato Detection and Pose Estimation


  • Jang, Minho (장민호; Dept. of Biosystems Engineering, Chungbuk National University)
  • Hwang, Youngbae (황영배; Dept. of Intelligent Systems & Robotics, Chungbuk National University)
  • Received : 2021.11.25
  • Accepted : 2022.01.14
  • Published : 2022.01.30

Abstract

Automatically providing information about fruits in agriculture-related broadcast content requires instance segmentation of the target fruit, and information on the fruit's 3D pose can also be put to meaningful use. This paper presents research on providing information about tomatoes in video content. Training an instance segmentation model requires a large amount of data, but sufficient tomato training data is difficult to obtain. We therefore generate training data from a small number of real images through data augmentation. Training on synthesized images, created by separating foreground and background and recombining them, improves detection performance over training on the real images alone. Training on images augmented with conventional image pre-processing techniques yields higher performance than training on the foreground/background synthetic images. To estimate pose from the detection result, we acquire a point cloud with an RGB-D camera, fit a cylinder to it by least-squares minimization, and take the axial direction of the cylinder as the tomato pose. Various experiments show that detection, instance segmentation, and cylinder fitting of the target object produce meaningful results.
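
The foreground/background synthesis mentioned in the abstract can be illustrated with a minimal sketch. It assumes NumPy and a pre-cut foreground crop with a binary mask; the function names `composite` and `random_composite` are hypothetical, and this is not the authors' pipeline, only one plausible way to paste an object onto a background while producing the matching instance mask.

```python
import numpy as np


def composite(background, foreground, fg_mask, top, left):
    """Paste a masked foreground crop onto a copy of the background.

    background: (H, W, 3) uint8 image
    foreground: (h, w, 3) uint8 crop containing the object
    fg_mask:    (h, w) bool array, True where the object is
    Returns the composite image and its (H, W) bool instance mask.
    """
    H, W = background.shape[:2]
    h, w = fg_mask.shape
    if not (0 <= top <= H - h and 0 <= left <= W - w):
        raise ValueError("foreground must fit inside the background")

    image = background.copy()                      # leave the input untouched
    instance_mask = np.zeros((H, W), dtype=bool)

    region = image[top:top + h, left:left + w]     # view into `image`
    region[fg_mask] = foreground[fg_mask]          # copy object pixels only
    instance_mask[top:top + h, left:left + w] = fg_mask
    return image, instance_mask


def random_composite(background, foreground, fg_mask, rng):
    """Place the foreground at a uniformly random position that fits."""
    H, W = background.shape[:2]
    h, w = fg_mask.shape
    top = int(rng.integers(0, H - h + 1))
    left = int(rng.integers(0, W - w + 1))
    return composite(background, foreground, fg_mask, top, left)


# Toy data: a black background and a 4-pixel "object" of value 200.
background = np.zeros((6, 8, 3), dtype=np.uint8)
foreground = np.full((2, 3, 3), 200, dtype=np.uint8)
fg_mask = np.array([[True, True, False],
                    [False, True, True]])

rng = np.random.default_rng(0)
image, mask = random_composite(background, foreground, fg_mask, rng)
```

Because the instance mask is generated together with the image, each synthetic sample comes with its segmentation label at no extra annotation cost, which is the practical appeal of this kind of synthesis when real labeled images are scarce.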

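
The abstract states only that the cylinder is fitted by least-squares minimization and that its axis gives the pose. The sketch below is a simplified stand-in under stated assumptions, not the authors' implementation: it assumes NumPy, grid-searches candidate axis directions over a hemisphere, and for each candidate fits a circle to the projected points by linear least squares. The names `fit_cylinder_axis`, `_plane_basis`, and `_circle_fit_cost` are hypothetical.

```python
import numpy as np


def _plane_basis(w):
    """Two orthonormal vectors spanning the plane orthogonal to unit vector w."""
    helper = np.array([1.0, 0.0, 0.0]) if abs(w[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(w, helper)
    u /= np.linalg.norm(u)
    v = np.cross(w, u)
    return u, v


def _circle_fit_cost(points, w):
    """Project points along w and fit a circle by linear least squares.

    Returns (sum of squared radial residuals, radius, a point on the axis).
    """
    u, v = _plane_basis(w)
    x, y = points @ u, points @ v
    # Circle model x^2 + y^2 = 2*cx*x + 2*cy*y + c, linear in (cx, cy, c),
    # where c = r^2 - cx^2 - cy^2.
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    radius = np.sqrt(max(c + cx**2 + cy**2, 0.0))
    residuals = np.hypot(x - cx, y - cy) - radius
    return float(residuals @ residuals), radius, cx * u + cy * v


def fit_cylinder_axis(points, n_theta=46, n_phi=90):
    """Pick the axis direction whose fitted circle has the smallest residual."""
    best = None
    for theta in np.linspace(0.0, np.pi / 2, n_theta):                 # polar angle
        for phi in np.linspace(0.0, 2 * np.pi, n_phi, endpoint=False):  # azimuth
            w = np.array([np.sin(theta) * np.cos(phi),
                          np.sin(theta) * np.sin(phi),
                          np.cos(theta)])
            cost, radius, center = _circle_fit_cost(points, w)
            if best is None or cost < best[0]:
                best = (cost, w, radius, center)
    _, axis, radius, center = best
    return axis, radius, center


# Synthetic check: the visible half of a tilted cylinder, radius 3 cm.
true_axis = np.array([np.sin(np.pi / 6) * np.cos(2 * np.pi / 9),
                      np.sin(np.pi / 6) * np.sin(2 * np.pi / 9),
                      np.cos(np.pi / 6)])
u0, v0 = _plane_basis(true_axis)
t, ang = np.meshgrid(np.linspace(-0.05, 0.05, 10), np.linspace(0.0, np.pi, 20))
t, ang = t.ravel(), ang.ravel()
origin = np.array([0.1, -0.2, 0.5])
points = (origin
          + t[:, None] * true_axis
          + 0.03 * (np.cos(ang)[:, None] * u0 + np.sin(ang)[:, None] * v0))

axis, radius, center = fit_cylinder_axis(points)
```

The grid search bounds the angular resolution of the estimate (about 2 degrees in polar angle with the default settings); a real pipeline would more likely refine the axis with a continuous nonlinear least-squares solver and handle sensor noise and outliers in the point cloud.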

Acknowledgement

This work was supported by the National University Development Project (2020) at Chungbuk National University.
