Sparse Feature Convolutional Neural Network with Cluster Max Extraction for Fast Object Classification

  • Kim, Sung Hee (Dept. of Electrical Engineering, Korea University)
  • Pae, Dong Sung (Dept. of Electrical Engineering, Korea University)
  • Kang, Tae-Koo (Dept. of Digital Electronics, Inha Technical College)
  • Kim, Dong W. (Dept. of Information and Telecommunication Engineering, Sangmyung University)
  • Lim, Myo Taeg (Dept. of Electrical Engineering, Korea University)
  • Received: 2018.03.27
  • Accepted: 2018.05.28
  • Published: 2018.11.01


We propose the Sparse Feature Convolutional Neural Network (SFCNN) to reduce the volume of convolutional neural networks (CNNs). Despite their superior classification performance, CNNs have an enormous network volume that incurs high computational cost and long processing time, which makes real-time applications such as online training difficult. SFCNN reduces the volume of conventional CNNs by producing a region-based sparse feature map. To produce this map, we propose two complementary region-based value extraction methods, cluster max extraction and local value extraction; cluster max is selected as the main function based on experimental results. To evaluate SFCNN, we compare it experimentally with two conventional CNNs. In multi-class classification on the Caltech101 dataset, the network trains 59 times faster and tests 81 times faster than the VGG network, with a 1.2% loss of accuracy. In vehicle classification on the GTI Vehicle Image Database, it trains 88 times faster and tests 94 times faster than the conventional CNNs, with a 0.1% loss of accuracy.


Supported by: National Research Foundation of Korea (NRF)

