An Efficient Large Graph Clustering Technique based on Min-Hash
  • Journal title : Journal of KIISE
  • Volume 43, Issue 3,  2016, pp.380-388
  • Publisher : Korean Institute of Information Scientists and Engineers
  • DOI : 10.5626/JOK.2016.43.3.380
 Title & Authors
An Efficient Large Graph Clustering Technique based on Min-Hash
Lee, Seok-Joo; Min, Jun-Ki;
 
 Abstract
Graph clustering is widely used to analyze a graph and identify its properties by generating clusters of similar vertices. Large graph data has recently been generated in diverse applications such as Social Network Services (SNS), the World Wide Web (WWW), and telephone networks, so graph clustering algorithms that process large graphs efficiently have become increasingly important. In this paper, we propose a clustering algorithm that efficiently generates clusters for large graph data. The proposed algorithm uses Min-Hash to estimate similarities between clusters in the graph and constructs clusters according to the estimated similarities. In experiments with real-world data sets, we demonstrate the efficiency of the proposed algorithm by comparing it with existing algorithms.
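The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it names: Min-Hash signatures approximate the Jaccard similarity between vertex sets, and similar sets are grouped together. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the use of closed neighborhoods as the sets being compared, the BLAKE2-based hash family, the threshold of 0.5, and the union-find grouping step.

```python
import hashlib
from itertools import combinations


def _hash(seed: int, vertex: int) -> int:
    """Deterministic 64-bit hash of a vertex id under the seed-th hash function."""
    data = f"{seed}:{vertex}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=256):
    """Signature = minimum hash value of the set under each hash function."""
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions; approximates |A & B| / |A | B|."""
    return sum(u == v for u, v in zip(sig_a, sig_b)) / len(sig_a)


def cluster_by_minhash(adjacency, threshold=0.5, num_hashes=256):
    """Group vertices whose closed neighborhoods look similar (illustrative only)."""
    # Closed neighborhood: the vertex itself plus its neighbors (an assumption).
    sigs = {v: minhash_signature(set(nbrs) | {v}, num_hashes)
            for v, nbrs in adjacency.items()}
    parent = {v: v for v in adjacency}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # All-pairs comparison is quadratic; acceptable for a toy graph only.
    for u, v in combinations(adjacency, 2):
        if estimate_jaccard(sigs[u], sigs[v]) >= threshold:
            parent[find(u)] = find(v)

    clusters = {}
    for v in adjacency:
        clusters.setdefault(find(v), set()).add(v)
    return sorted(clusters.values(), key=min)


def two_cliques_with_bridge():
    """Toy graph: cliques {0..4} and {5..9} joined by the single edge 4-5."""
    adj = {v: set() for v in range(10)}
    for group in (range(0, 5), range(5, 10)):
        for u, v in combinations(group, 2):
            adj[u].add(v)
            adj[v].add(u)
    adj[4].add(5)
    adj[5].add(4)
    return adj


if __name__ == "__main__":
    print(cluster_by_minhash(two_cliques_with_bridge()))
```

On the toy graph, vertices inside the same clique share most of their closed neighborhood (Jaccard similarity of 5/6 or 1), while pairs across the bridge share at most 2 of 10 vertices, so a threshold of 0.5 separates the two groups. A method aimed at large graphs would need to avoid the all-pairs loop used here.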
 Keywords
graph clustering; min-hash; data mining; large graph
 Language
Korean
 Cited by
1.
An Optimization Method for Graph Processing Using In-Storage Processing, Song, Nae-Young; Han, Hyuck; Yeom, Heon-Young;

KIISE Transactions on Computing Practices, Vol. 23, No. 8, pp. 473-480, 2017.