DocumentCode
1925509
Title
Scalable link analysis algorithm for a distributed memory environment
Author
Arulraj, Joy James Prabhu
Author_Institution
DCSE, Anna Univ., Chennai, India
fYear
2010
fDate
28-30 Oct. 2010
Firstpage
61
Lastpage
66
Abstract
The convergence rate of iterative stationary methods for the PageRank linear system is significant for efficient computation of the relative rank of web pages based on the Web graph. This paper investigates this problem from the perspective of a message-passing framework on a distributed memory parallel programming model. A three-dimensional Web model is introduced to present a scalable link analysis algorithm that can compute the rank vector of a linear system with a high degree of convergence and accuracy. The algorithm derives its scalability by eliminating the need for any global information or page categorization. The details of the MPI-based parallel implementation and the experimental results obtained on a 8-node cluster of Intel Xeon 3.0 GHz, 1.5GB RAM machines over a 100MBit Ethernet LAN are provided. The variation in convergence behavior with residual probability is analyzed. For the WebBase crawl, a speedup of 12.45 is obtained by the scalable algorithm over the centralized algorithm.
Keywords
Web sites; convergence; graph theory; iterative methods; message passing; parallel programming; probability; search engines; Ethernet LAN; Intel Xeon; MPI; PageRank linear system; Web graph; Web pages ranking; WebBase crawl; centralized algorithm; convergence rate; distributed memory parallel programming model; frequency 3.0 GHz; iterative stationary method; memory size 1.5 GByte; message-passing framework; residual probability; scalable algorithm; scalable link analysis algorithm; three-dimensional Web model; Algorithm design and analysis; Clustering algorithms; Convergence; Grid computing; Iterative methods; Linear systems; Markov processes;
fLanguage
English
Publisher
ieee
Conference_Titel
Parallel Distributed and Grid Computing (PDGC), 2010 1st International Conference on
Conference_Location
Solan
Print_ISBN
978-1-4244-7675-6
Type
conf
DOI
10.1109/PDGC.2010.5679603
Filename
5679603
Link To Document