Human Activities Recognition Based on Skeleton Information via Sparse Representation

  • Liu, Suolan (School of Information Science and Engineering, Changzhou University) ;
  • Kong, Lizhi (School of Materials Science and Engineering, Changzhou University) ;
  • Wang, Hongyuan (School of Information Science and Engineering, Changzhou University)
  • Received : 2017.04.25
  • Accepted : 2018.01.08
  • Published : 2018.03.30

Abstract

Human activity recognition is a challenging task due to the complexity of human movements and the variation among different subjects performing the same action. This paper presents a recognition algorithm that uses skeleton information generated from depth maps. A feature vector is produced by concatenating motion features with a temporal constraint feature, and an improved fast classifier based on sparse representation is proposed by reducing the dictionary scale. The developed method is shown to be effective at recognizing different activities on the UTD-MHAD dataset, and comparison results indicate that it outperforms several existing methods.
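
The classification step described above, sparse representation over a dictionary of training features with class-wise reconstruction residuals, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the scikit-learn Lasso solver, the regularization value, and the toy dimensions are all assumptions, and the paper's dictionary-scale reduction is not reproduced here.

# Minimal sketch (assumptions noted above): sparse-representation
# classification over concatenated skeleton features.
import numpy as np
from sklearn.linear_model import Lasso


def concat_features(motion_feat, temporal_feat):
    """Concatenate per-sequence motion features with a temporal constraint feature."""
    return np.concatenate([motion_feat.ravel(), temporal_feat.ravel()])


def src_classify(x, dictionary, labels, lam=0.01):
    """Classify a test feature x by sparse coding over a training dictionary.

    dictionary : (feature_dim, n_atoms) matrix whose columns are training features
    labels     : (n_atoms,) class label of each dictionary atom
    """
    # Solve min ||x - D a||_2^2 + lam * ||a||_1 for the sparse code a.
    solver = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
    solver.fit(dictionary, x)
    code = solver.coef_

    # Assign the class whose atoms reconstruct x with the smallest residual.
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        recon = dictionary[:, mask] @ code[mask]
        residuals[c] = np.linalg.norm(x - recon)
    return min(residuals, key=residuals.get)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 120))               # 120 training atoms, 64-dim features
    y = np.repeat(np.arange(6), 20)                  # 6 hypothetical activity classes
    test = D[:, 3] + 0.05 * rng.standard_normal(64)  # noisy copy of a class-0 atom
    print(src_classify(test, D, y))                  # should typically print 0

In the paper's setting, the dictionary columns would be the concatenated motion and temporal-constraint features of the training sequences, with the dictionary scale reduced before coding to speed up classification.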

Acknowledgement

Supported by: National Natural Science Foundation of China
