• DocumentCode
    170888
  • Title

    A general framework of hybrid graph sampling for complex network analysis

  • Author

    Xin Xu ; Chul-Ho Lee ; Do Young Eun

  • Author_Institution
    Dept. of Electr. & Comput. Eng., North Carolina State Univ., Raleigh, NC, USA
  • fYear
    2014
  • fDate
    April 27 2014-May 2 2014
  • Firstpage
    2795
  • Lastpage
    2803
  • Abstract
    Being able to capture the properties of massive real graphs and also greatly reduce data scale and processing complexity, graph sampling techniques provide an efficient tool for complex network analysis. Random walk-based sampling has become popular to obtain asymptotically uniform samples in the recent literature. However, it produces highly correlated samples and often leads to poor estimation accuracy in sampling large networks. Another widely-used approach is to launch random jump by querying randomly generated user/node ID, but also has the drawback of unexpected cost when the ID space is sparsely populated. In this paper, we develop a hybrid graph sampling framework that inherits the benefit of returning immediate samples from random walk-based crawling, while incorporating the advantage of reducing the correlation in the obtained samples from random jump. We aim to strike the right balance between random jump and crawling by analyzing the resulting asymptotic variance of an estimator of any graph nodal property, in order to give guidelines on the design of better graph sampling methods. We also provide simulation results on real network (graph) to confirm our theoretical findings.
  • Keywords
    data reduction; graph theory; network theory (graphs); random processes; sampling methods; social networking (online); ID space; asymptotic variance analysis; complex network analysis; data processing complexity; data scale reduction; graph nodal property; hybrid graph sampling; hybrid graph sampling framework; massive real graphs; random jump; random walk-based crawling; Complex networks; Computers; Correlation; Eigenvalues and eigenfunctions; Estimation; Markov processes; Sampling methods;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    INFOCOM, 2014 Proceedings IEEE
  • Conference_Location
    Toronto, ON
  • Type

    conf

  • DOI
    10.1109/INFOCOM.2014.6848229
  • Filename
    6848229