DocumentCode
65724
Title
A Measurement Framework for Directed Networks
Author
Salehi, Marzieh ; Rabiee, Hamid R.
Author_Institution
Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran, Iran
Volume
31
Issue
6
fYear
2013
fDate
Jun-13
Firstpage
1007
Lastpage
1016
Abstract
Partially observed network data collected by link-tracing-based sampling methods are often studied to obtain the characteristics of a large complex network. However, little attention has been paid to sampling from directed networks such as the WWW and peer-to-peer networks. In this paper, we propose a novel two-step (sampling/estimation) framework to measure nodal characteristics that can be defined by an average target function in an arbitrary directed network. To this end, we propose a personalized PageRank-based algorithm to visit and sample nodes. This algorithm uses only already-visited nodes as local information, without any prior knowledge of the latent structure of the network. Moreover, we introduce a new estimator based on approximate importance sampling to estimate average target functions. The proposed estimator uses the calculated PageRank value of each sampled node as an approximation of its exact visiting probability. To the best of our knowledge, this is the first study on correcting the bias of a sampling method by re-weighting measured values that considers the effect of approximating the visiting probabilities. Comprehensive theoretical and empirical analyses of the estimator demonstrate that it is asymptotically unbiased even when the stationary distribution of PageRank is poorly approximated.
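The two-step idea in the abstract (sample nodes by a PageRank-style walk, then re-weight the measured values by the inverse of each node's approximate visiting probability) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a plain self-normalized importance-sampling estimate, sum(f(v)/p(v)) / sum(1/p(v)), and, for simplicity, it computes PageRank on the full toy graph and lets the walk jump to any node, whereas the paper restricts itself to already-visited nodes. The function names `pagerank`, `walk_with_jumps`, and `reweighted_average`, the damping value 0.85, and the toy graph are all assumptions made for illustration.

```python
import random


def pagerank(out_links, damping=0.85, iters=100):
    """Power iteration for PageRank on a directed graph {node: [successors]}."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every node receives the uniform teleport share first.
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            succ = out_links[v]
            if succ:
                share = damping * rank[v] / len(succ)
                for w in succ:
                    nxt[w] += share
            else:
                # Dangling node: spread its mass uniformly over all nodes.
                share = damping * rank[v] / n
                for w in nodes:
                    nxt[w] += share
        rank = nxt
    return rank


def walk_with_jumps(out_links, steps, damping=0.85, seed=0):
    """Random walk that follows an out-link with probability `damping`
    and otherwise (or at a dangling node) jumps to a uniform random node."""
    rng = random.Random(seed)
    nodes = list(out_links)
    current = rng.choice(nodes)
    samples = []
    for _ in range(steps):
        succ = out_links[current]
        if succ and rng.random() < damping:
            current = rng.choice(succ)
        else:
            current = rng.choice(nodes)
        samples.append(current)
    return samples


def reweighted_average(samples, f, visit_prob):
    """Self-normalized importance-sampling estimate of the mean of f over nodes.

    Each sampled value is weighted by 1 / visit_prob[node], which corrects
    for nodes that the walk visits more often than others.
    """
    num = sum(f(v) / visit_prob[v] for v in samples)
    den = sum(1.0 / visit_prob[v] for v in samples)
    return num / den


if __name__ == "__main__":
    # Hypothetical toy graph; the target function is the out-degree.
    graph = {0: [1, 2], 1: [2], 2: [0], 3: [0, 1, 2], 4: []}
    probs = pagerank(graph)
    visited = walk_with_jumps(graph, steps=20000)
    estimate = reweighted_average(visited, lambda v: len(graph[v]), probs)
    true_mean = sum(len(s) for s in graph.values()) / len(graph)
    print("estimated mean out-degree:", estimate)
    print("true mean out-degree:", true_mean)
```

Without the re-weighting step, a plain average over the visited nodes would be biased toward nodes with high PageRank; dividing by the (approximate) visiting probability is what removes that bias in this sketch.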
Keywords
Internet; estimation theory; importance sampling; probability; search engines; PageRank value; World Wide Web; approximate importance sampling; average target function estimation; complex network; directed network; estimator; link-tracing based sampling method; measured value reweighting; measurement framework; network latent structure; nodal characteristics; peer-to-peer network; personalized PageRank-based algorithm; visiting probability; Directed Networks; Estimation; Link-tracing Sampling; PageRank;
fLanguage
English
Journal_Title
Selected Areas in Communications, IEEE Journal on
Publisher
IEEE
ISSN
0733-8716
Type
jour
DOI
10.1109/JSAC.2013.130603
Filename
6517105