Title :
A Two -- Stage Reservoir Sampling Algorithm for Massive Network Graphs
Author_Institution :
Fac. of Inf. Sci., Kyushu Sangyo Univ., Fukuoka, Japan
Abstract :
Independent and uniform random sampling from massive network graphs is still a great challenge due to the lack of a sample frame or a full list of all elements to be sampled from. The state -- of -- the -- art solution to this challenge is random walk with re -- weighting, where bias introduced by random walk is corrected by a re -- weighting process. The obtained sample however is both biased and may involves correlation. In this paper, we propose a reservoir -- based sampling algorithm to cope with the bias and correlation issues. At the first stage, we perform random sampling with replacement using a uniform reservoir first_res, which maintains a set of elements that may contain duplicates and biases. Then at the second stage, we do weighted reservoir sampling from distinct elements of first_res. In this process bias and correlation will be eliminated. We perform simulation -- driven experiments to evaluate the proposed algorithm and verified that it can produce uniform sampling results efficiently.
Keywords :
"Reservoirs","Correlation","Sociology","Arrays","Social network services","Peer-to-peer computing"
Conference_Titel :
Network-Based Information Systems (NBiS), 2015 18th International Conference on
DOI :
10.1109/NBiS.2015.20