Title :
A general framework of hybrid graph sampling for complex network analysis
Author :
Xin Xu ; Chul-Ho Lee ; Do Young Eun
Author_Institution :
Dept. of Electr. & Comput. Eng., North Carolina State Univ., Raleigh, NC, USA
fDate :
April 27 2014-May 2 2014
Abstract :
Being able to capture the properties of massive real graphs and also greatly reduce data scale and processing complexity, graph sampling techniques provide an efficient tool for complex network analysis. Random walk-based sampling has become popular to obtain asymptotically uniform samples in the recent literature. However, it produces highly correlated samples and often leads to poor estimation accuracy in sampling large networks. Another widely-used approach is to launch random jump by querying randomly generated user/node ID, but also has the drawback of unexpected cost when the ID space is sparsely populated. In this paper, we develop a hybrid graph sampling framework that inherits the benefit of returning immediate samples from random walk-based crawling, while incorporating the advantage of reducing the correlation in the obtained samples from random jump. We aim to strike the right balance between random jump and crawling by analyzing the resulting asymptotic variance of an estimator of any graph nodal property, in order to give guidelines on the design of better graph sampling methods. We also provide simulation results on real network (graph) to confirm our theoretical findings.
Keywords :
data reduction; graph theory; network theory (graphs); random processes; sampling methods; social networking (online); ID space; asymptotic variance analysis; complex network analysis; data processing complexity; data scale reduction; graph nodal property; hybrid graph sampling; hybrid graph sampling framework; massive real graphs; random jump; random walk-based crawling; Complex networks; Computers; Correlation; Eigenvalues and eigenfunctions; Estimation; Markov processes; Sampling methods;
Conference_Titel :
INFOCOM, 2014 Proceedings IEEE
Conference_Location :
Toronto, ON
DOI :
10.1109/INFOCOM.2014.6848229