Robust appearance feature learning using pixel-wise discrimination for visual tracking

  • Kim, Minji (Division of Computer Science and Engineering, Chonbuk National University)
  • Kim, Sungchan (Division of Computer Science and Engineering, Chonbuk National University)
  • Received : 2018.08.30
  • Accepted : 2019.02.13
  • Published : 2019.08.02

Abstract

Considering the high dimensionality of video sequences, acquiring a dataset sufficient to train tracking models is often challenging. From this perspective, we propose revisiting the idea of hand-crafted feature learning to avoid this dataset requirement. The proposed tracking approach comprises two phases, detection and tracking, selected according to how severely the target's appearance changes. The detection phase addresses severe and rapid variations by learning a new appearance model that classifies pixels into foreground (target) and background. For robust target representation, we combine the raw pixel features of color intensity and spatial location with convolutional feature activations. The tracking phase follows the target by searching for the frame region whose pixels agree best with the model learned in the detection phase. Our two-phase approach results in efficient and accurate tracking, outperforming recent methods in various challenging cases of target appearance change.
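
The abstract describes building a per-pixel feature vector from raw color, spatial location, and convolutional activations, then learning a foreground/background classifier. As a minimal illustrative sketch only (not the authors' implementation; the classifier choice, the nearest-neighbor upsampling, and all function names below are assumptions), the following Python fragment shows one way such pixel-wise discrimination could be assembled:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # stand-in classifier (assumption)

def pixelwise_features(image, conv_map):
    """Concatenate raw color intensities, normalized (x, y) locations,
    and upsampled convolutional activations into one vector per pixel."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys / h, xs / w], axis=-1)   # spatial-location cue
    ch, cw, _ = conv_map.shape
    up = conv_map[ys * ch // h, xs * cw // w]      # nearest-neighbor upsample to image size
    feats = np.concatenate([image / 255.0, coords, up], axis=-1)  # assumes uint8 image
    return feats.reshape(h * w, -1)

def learn_appearance_model(image, conv_map, fg_mask):
    """Detection phase: fit a pixel-level foreground/background classifier."""
    X = pixelwise_features(image, conv_map)
    y = fg_mask.reshape(-1).astype(int)            # 1 = target pixel, 0 = background
    return LogisticRegression(max_iter=500).fit(X, y)

def region_agreement(model, region, region_conv):
    """Tracking phase: mean foreground probability over a candidate region."""
    X = pixelwise_features(region, region_conv)
    return model.predict_proba(X)[:, 1].mean()
```

In this reading of the two-phase scheme, the tracker would score candidate regions with region_agreement and keep the one with the best pixel-level agreement, falling back to the detection phase when appearance changes severely.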

