2D and 3D Hand Pose Estimation Based on Skip Connection Form

스킵 연결 형태 기반의 손 관절 2D 및 3D 검출 기법

  • Ku, Jong-Hoe (Department of Computer Engineering, Pusan National University)
  • Kim, Mi-Kyung (Software Education Center, Pusan National University)
  • Cha, Eui-Young (Department of Computer Engineering, Pusan National University)

  • Received: 2020.09.08
  • Accepted: 2020.09.18
  • Published: 2020.12.31

Abstract

Traditional pose estimation methods either rely on special devices or detect poses from camera images using classical image processing. Special devices are costly and restrict the environments in which they can be used. Cameras combined with image processing reduce both the environmental constraints and the cost, but they perform worse. To overcome these drawbacks, pose estimation methods that use only a camera together with convolutional neural networks (CNNs) have been studied, and various techniques have been proposed to improve CNN recognition performance. In this paper, we apply one of these techniques, the skip connection, in various forms to a hand joint detection network and examine experimentally how skip connections affect the network. The experiments confirm that adding skip connections beyond the basic ones improves performance, and that the network with only downward skip connections added performs best.
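To make the terms "basic" and "downward" skip connection concrete, here is a minimal toy sketch in plain Python. It is not the authors' network: the abstract does not specify the topology, so the 1D three-level encoder-decoder, the additive merge, the function names, and the weights `W2` and `W3` are all hypothetical. In particular, reading a "downward" skip as a path from a higher-resolution encoder feature to a lower-resolution stage is an assumption.

```python
# Toy 1D encoder-decoder showing where skip connections enter.
# Every name and constant is hypothetical; the paper's real model is a 2D CNN.

W2, W3 = 0.5, 2.0  # stand-in "learned" weights for the two encoder stages


def downsample(x):
    """Halve resolution by averaging adjacent pairs (stand-in for pooling)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]


def upsample(x):
    """Double resolution by repeating each value (nearest neighbour)."""
    return [v for v in x for _ in range(2)]


def stage(x, w):
    """Scale then ReLU: a stand-in for a convolution block."""
    return [max(0.0, w * v) for v in x]


def add(a, b):
    """Element-wise sum, the merge used for every skip connection here."""
    assert len(a) == len(b), "skip inputs must share a resolution"
    return [p + q for p, q in zip(a, b)]


def forward(x, basic_skip=True, downward_skip=False):
    # Encoder: full -> half -> quarter resolution.
    e1 = x
    e2 = stage(downsample(e1), W2)
    e3 = stage(downsample(e2), W3)

    # ASSUMPTION: a "downward" skip carries the full-resolution feature
    # straight to the lowest-resolution stage, bypassing the middle stage.
    if downward_skip:
        e3 = add(e3, downsample(downsample(e1)))

    # Decoder: quarter -> half -> full resolution.
    d2 = upsample(e3)
    if basic_skip:  # same-resolution encoder-to-decoder skip (U-Net style)
        d2 = add(d2, e2)
    d1 = upsample(d2)
    if basic_skip:
        d1 = add(d1, e1)
    return d1


if __name__ == "__main__":
    signal = [1.0, 3.0, 5.0, 7.0]
    print(forward(signal, basic_skip=False))
    print(forward(signal, basic_skip=True))
    print(forward(signal, basic_skip=True, downward_skip=True))
```

The sketch only shows how each kind of connection changes what reaches the decoder; it says nothing about accuracy, which the paper evaluates experimentally on a hand joint detection network.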
