Authors :
Berlingerio, Michele ; Koutra, Danai ; Eliassi-Rad, Tina ; Faloutsos, Christos
Abstract :
Given a set of k networks, possibly with different sizes and no overlaps in nodes or links, how can we quickly assess similarity between them? Analogously, is there a set of social theories which, when represented by a small number of descriptive, numerical features, effectively serves as a “signature” for the network? Having such signatures will enable a wealth of graph mining and social network analysis tasks, including clustering, outlier detection, visualization, etc. We propose a novel, effective, and scalable method, called NetSimile, for solving the above problem. Our approach has the following desirable properties: (a) It is supported by a set of social theories. (b) It gives similarity scores that are size-invariant. (c) It is scalable, being linear in the number of links for graph signature extraction. In extensive experiments on numerous synthetic and real networks from disparate domains, NetSimile outperforms baseline competitors. We also demonstrate how our approach enables several mining tasks such as clustering, visualization, discontinuity detection, network transfer learning, and re-identification across networks.
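The signature idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single per-node feature (degree), three aggregators (mean, median, population standard deviation), and the Canberra distance for comparing signatures. The abstract specifies none of these choices, and the function names `degree_features`, `signature`, and `canberra` are hypothetical.

```python
from statistics import mean, median, pstdev


def degree_features(edges):
    """Compute each node's degree in one pass over an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return list(deg.values())


def signature(edges):
    """Aggregate per-node degrees into a fixed-length vector.

    The vector length does not depend on the number of nodes, so
    networks of different sizes can be compared directly.
    """
    d = degree_features(edges)
    return [mean(d), median(d), pstdev(d)]


def canberra(x, y):
    """Canberra distance; terms whose denominator is zero are skipped."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


# Assumed toy networks: a triangle, a 4-cycle, and a star with four leaves.
triangle = [(0, 1), (1, 2), (2, 0)]
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
star = [(0, 1), (0, 2), (0, 3), (0, 4)]

d_ring = canberra(signature(triangle), signature(square))
d_star = canberra(signature(triangle), signature(star))
```

In this sketch the triangle and the 4-cycle receive identical signatures despite having different node counts, while the star is placed farther away. Degree extraction visits each link once, which is in the spirit of the linear scaling claimed above.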
Keywords :
data mining; graph theory; social networking (online); NetSimile; graph mining; graph signature extraction; k networks; multiple social theories; network similarity; social network analysis tasks; Educational institutions; Electronic mail; Feature extraction; Size measurement; Social network services; Vectors; Visualization