Depth Image-Based Human Action Recognition Using Convolution Neural Network and Spatio-Temporal Templates

  • Eum, Hyukmin (Dept. of Electrical and Electronic Engineering, Yonsei University)
  • Yoon, Changyong (Dept. of Electrical Engineering, Suwon Science College)
  • Received : 2016.06.10
  • Accepted : 2016.09.01
  • Published : 2016.10.01

Abstract

This paper proposes a method for recognizing human actions, a form of nonverbal expression. The method consists of two steps: action representation and action recognition. The action representation step uses a Motion History Image (MHI); it performs segmentation based on depth information and generates spatio-temporal templates that describe actions. The action recognition step employs a Convolutional Neural Network (CNN), which covers both feature extraction and classification; it extracts convolutional feature vectors and then uses a classifier to recognize actions. Experimental results demonstrate the recognition performance of the proposed method by comparing it with other action recognition methods.
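The action representation step can be illustrated with a minimal sketch of the standard MHI update rule, in which a moving pixel is set to a maximum value `tau` and every other pixel decays by `delta` toward zero. The abstract does not give the authors' segmentation rule or parameters, so everything specific here is an assumption: the depth-range test in `segment_by_depth`, the frame-differencing motion test, and the names `near`, `far`, `tau`, `delta`, and `diff_threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np


def segment_by_depth(depth, near, far):
    """Hypothetical segmentation: keep pixels whose depth lies in [near, far]."""
    return (depth >= near) & (depth <= far)


def update_mhi(mhi, motion_mask, tau, delta=1):
    """One MHI step: moving pixels become tau, the rest decay by delta (floor 0)."""
    decayed = np.maximum(mhi - delta, 0)
    return np.where(motion_mask, tau, decayed)


def depth_mhi(frames, near, far, tau, diff_threshold):
    """Build a spatio-temporal template from a sequence of depth frames.

    Motion is assumed to be a depth change larger than diff_threshold
    on a pixel that belongs to the segmented foreground.
    """
    mhi = np.zeros(frames[0].shape, dtype=np.int32)
    # Cast to a signed type so the frame difference cannot wrap around.
    prev = frames[0].astype(np.int32)
    for frame in frames[1:]:
        cur = frame.astype(np.int32)
        foreground = segment_by_depth(cur, near, far)
        moving = (np.abs(cur - prev) > diff_threshold) & foreground
        mhi = update_mhi(mhi, moving, tau)
        prev = cur
    return mhi
```

In the resulting template, recently moved pixels hold larger values than pixels that moved earlier, so a single image encodes both where and when motion occurred. Such a template could then be passed to a CNN as an ordinary single-channel image.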
