Trends in Online Action Detection in Streaming Videos

온라인 행동 탐지 기술 동향

  • Published: 2021.04.01

Abstract

Online action detection (OAD) in streaming video has recently attracted growing research interest. Most studies on action understanding have addressed either action recognition in well-trimmed videos or offline temporal action detection in untrimmed videos, but monitoring action occurrences in streaming videos requires online methods. OAD predicts action probabilities for the current frame or frame sequence from a fixed-size video segment made up of past and current frames. In this article, we review deep learning-based OAD models. We also survey OAD evaluation methodologies, including benchmark datasets and performance measures, and compare the performance of the presented OAD models.
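
The fixed-size segment setting described in the abstract can be illustrated with a minimal Python sketch. It is not any of the surveyed models: the mean-pooling "model", the random weights `W`, and the constants `NUM_CLASSES`, `FEAT_DIM`, and `WINDOW` are all assumptions chosen for illustration. The only point it shows is that each prediction is made from a bounded buffer of past and current frames, never future ones.

```python
from collections import deque

import numpy as np

# All sizes below are hypothetical, chosen only for illustration.
NUM_CLASSES = 4   # e.g. 3 action classes + background
FEAT_DIM = 8      # per-frame feature dimension
WINDOW = 16       # fixed segment length in frames

rng = np.random.default_rng(0)
W = rng.normal(size=(FEAT_DIM, NUM_CLASSES))  # stand-in for trained weights


def predict_current(segment):
    """Return class probabilities for the newest frame in `segment`."""
    pooled = segment.mean(axis=0)          # placeholder for a temporal model
    logits = pooled @ W
    exp = np.exp(logits - logits.max())    # numerically stable softmax
    return exp / exp.sum()


def online_detect(frame_features, window=WINDOW):
    """Yield one probability vector per incoming frame.

    Only the current frame and up to `window - 1` past frames are used.
    """
    buffer = deque(maxlen=window)          # oldest frame drops out automatically
    for feat in frame_features:
        buffer.append(feat)
        yield predict_current(np.stack(buffer))


stream = rng.normal(size=(40, FEAT_DIM))   # synthetic stand-in for frame features
probs = list(online_detect(stream))
```

In this sketch the first few predictions use fewer than `WINDOW` frames because the buffer is still filling; padding the segment instead would be an equally plausible choice.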
