Nearest-Neighbors Based Weighted Method for the BOVW Applied to Image Classification


Xu, Mengxi; Sun, Quansen; Lu, Yingshu; Shen, Chenming

  • Received : 2014.11.27
  • Accepted : 2015.04.13
  • Published : 2015.07.01

Abstract

This paper presents a new Nearest-Neighbors based weighted representation for images and a weighted K-Nearest-Neighbors (WKNN) classifier to improve the precision of image classification with Bag of Visual Words (BOVW) based models. Scale-invariant feature transform (SIFT) features are first extracted from the images. The K-means++ algorithm is then adopted in place of the conventional K-means algorithm to generate a more effective visual dictionary. Furthermore, the histogram of visual words is made more expressive by the proposed weighted vector quantization (WVQ). Finally, the WKNN classifier is applied to improve classification between images that contain similar levels of background noise. Average precision and absolute change degree are calculated to assess classification performance and the stability of the K-means++ algorithm, respectively. Experimental results on three diverse datasets (Caltech-101, Caltech-256, and PASCAL VOC 2011) show that the proposed WVQ and WKNN methods further improve classification performance.
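The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustrative toy, not the paper's method: the dictionary uses only K-means++ seeding (no Lloyd refinement), the "weighted vector quantization" here is a generic inverse-distance soft assignment, and the weighted KNN votes by inverse neighbor distance; the paper's actual WVQ and WKNN weighting schemes are not specified in this excerpt. The synthetic 2-D "descriptors" stand in for SIFT features.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeanspp_seed(X, k):
    """K-means++ seeding: first center uniform; each subsequent center is
    sampled with probability proportional to its squared distance D^2
    to the nearest already-chosen center."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min(((X[:, None] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

def weighted_histogram(descriptors, centers):
    """Soft (weighted) vector quantization: every descriptor votes for each
    visual word with weight inversely proportional to its distance, instead
    of hard-assigning to the single nearest word."""
    d = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
    w = 1.0 / (d + 1e-8)
    w /= w.sum(axis=1, keepdims=True)   # per-descriptor weights sum to 1
    h = w.sum(axis=0)
    return h / h.sum()                  # L1-normalized image histogram

def wknn_predict(train_h, train_y, test_h, k=3):
    """Distance-weighted KNN: the k nearest training histograms vote,
    each with weight 1/distance."""
    preds = []
    for h in test_h:
        d = np.linalg.norm(train_h - h, axis=1)
        votes = {}
        for i in np.argsort(d)[:k]:
            votes[train_y[i]] = votes.get(train_y[i], 0.0) + 1.0 / (d[i] + 1e-8)
        preds.append(max(votes, key=votes.get))
    return preds

# Toy demo: two classes whose "descriptors" cluster around different centers.
def make_image(center, n=20):
    return center + rng.normal(scale=0.3, size=(n, 2))

imgs = [make_image(np.array([0.0, 0.0])) for _ in range(3)] + \
       [make_image(np.array([5.0, 5.0])) for _ in range(3)]
labels = ["A"] * 3 + ["B"] * 3

centers = kmeanspp_seed(np.vstack(imgs), k=2)
hists = np.array([weighted_histogram(im, centers) for im in imgs])

test_img = make_image(np.array([0.0, 0.0]))
pred = wknn_predict(hists, labels, [weighted_histogram(test_img, centers)], k=3)
print(pred)  # a near-origin test image should land in class "A"
```

In a real BOVW system the descriptors would be 128-D SIFT vectors, the dictionary would contain hundreds to thousands of visual words refined by full K-means++ clustering, and the histograms would feed the classifier over a labeled training set.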

Keywords

Bag of visual words; K-means++ algorithm; Weighted vector quantization; Weighted K-nearest-neighbors classifier
