Title :
Entropy-Based Graph Clustering: Application to Biological and Social Networks
Author :
Kenley, Edward Casey ; Cho, Young-Rae
Author_Institution :
Dept. of Comput. Sci., Baylor Univ., Waco, TX, USA
Abstract :
Complex systems have been widely studied to characterize their structural behaviors from a topological perspective. High modularity is one of the recurrent features of real-world complex systems. Various graph clustering algorithms have been applied to identifying communities in social networks or modules in biological networks. However, their applicability to real-world systems has been limited because of the massive scale and complex connectivity of the networks. In this study, we exploit a novel information-theoretic model for graph clustering. The entropy-based clustering approach finds locally optimal clusters by growing a random seed in a manner that minimizes graph entropy. We design and analyze modifications that further improve its performance. Assigning priority in seed-selection and seed-growth is well applicable to the scale-free networks characterized by the hub-oriented structure. Computing seed-growth in parallel streams also decomposes an extremely large network efficiently. The experimental results with real biological and social networks show that the entropy-based approach has better performance than competing methods in terms of accuracy and efficiency.
Keywords :
complex networks; graph theory; large-scale systems; network theory (graphs); pattern clustering; social networking (online); biological networks; entropy based graph clustering; hub oriented structure; information theoretic model; real world complex systems; scale free networks; seed growth; seed selection; social networks; Accuracy; Clustering algorithms; Entropy; MySpace; Proteins; biological networks; complex systems; graph clustering; graph entropy; graph mining; social networks;
Conference_Titel :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver,BC
Print_ISBN :
978-1-4577-2075-8
DOI :
10.1109/ICDM.2011.64