Robust Similarity Measure for Spectral Clustering Based on Shared Neighbors

  • Ye, Xiucai (Department of Computer Science, University of Tsukuba)
  • Sakurai, Tetsuya (Department of Computer Science, University of Tsukuba)
  • Received : 2015.06.05
  • Accepted : 2016.03.02
  • Published : 2016.06.01

Abstract

Spectral clustering is a powerful tool for exploratory data analysis. Many existing spectral clustering algorithms measure similarity with a Gaussian kernel function or an undirected k-nearest neighbor (kNN) graph, which cannot reveal the true clusters when the data are not well separated. In this paper, to improve spectral clustering, we consider a robust similarity measure based on the shared nearest neighbors in a directed kNN graph. We propose two novel spectral clustering algorithms: one based on the number of shared nearest neighbors, and one based on their closeness. The proposed algorithms are able to explore the underlying similarity relationships between data points and are robust to datasets that are not well separated. Moreover, they have only one parameter, k. We evaluated the proposed algorithms on synthetic and real-world datasets. The experimental results demonstrate that the proposed algorithms not only perform well but also outperform traditional spectral clustering algorithms.
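
To make the idea of a count-based shared-neighbor similarity concrete, the following is a minimal Python sketch. It is not the algorithm from the paper: the function names `knn_lists` and `shared_neighbor_similarity`, the Euclidean distance, the index-based tie-breaking, and the normalization by k are all assumptions chosen for illustration. It builds a directed kNN graph and scores each pair of points by the fraction of nearest neighbors they have in common.

```python
import math


def knn_lists(points, k):
    """Directed kNN graph: for each point, the set of indices of its k nearest neighbors.

    Assumption: Euclidean distance, with ties broken by the smaller index.
    """
    neighbors = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def shared_neighbor_similarity(points, k):
    """Symmetric matrix S with S[i][j] = |kNN(i) & kNN(j)| / k for i != j.

    Assumption: dividing by k is one plausible normalization, not
    necessarily the one used in the paper. The diagonal is left at 0.
    """
    nn = knn_lists(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(nn[i] & nn[j])
            sim[i][j] = sim[j][i] = shared / k
    return sim


# Two small, well-separated groups of four points each (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
sim = shared_neighbor_similarity(points, k=2)
```

In this toy example, points in different groups share no neighbors, so their similarity is zero. A matrix like `sim` could then serve as the affinity matrix in a standard spectral clustering pipeline, with k as the only parameter to choose.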
