DocumentCode :
42585
Title :
PrIter: A Distributed Framework for Prioritizing Iterative Computations
Author :
Yanfeng Zhang ; Qixin Gao ; Lixin Gao ; Cuirong Wang
Author_Institution :
Comput. Center, Northeastern Univ., Shenyang, China
Volume :
24
Issue :
9
fYear :
2013
fDate :
Sept. 2013
Firstpage :
1884
Lastpage :
1893
Abstract :
Iterative computations are pervasive among data analysis applications, including web search, online social network analysis, recommendation systems, and so on. These applications typically involve data sets of massive scale. Fast convergence of iterative computations on massive data sets is essential for these applications. In this paper, we explore the opportunity for accelerating iterative computations by prioritization. Instead of performing computations on all data points without discrimination, we prioritize the computations that help convergence the most, so that the convergence speed of the iterative process is significantly improved. We develop a distributed computing framework, PrIter, which supports the prioritized execution of iterative computations. PrIter either stores intermediate data in memory for fast convergence or stores intermediate data in files for scaling to larger data sets. We evaluate PrIter on a local cluster of machines as well as on the Amazon EC2 cloud. The results show that PrIter achieves up to 50× speedup over Hadoop for a series of iterative algorithms. In addition, PrIter is shown to perform better on iterative computations than other state-of-the-art distributed frameworks such as Spark and Piccolo.
Keywords :
cloud computing; data analysis; distributed processing; iterative methods; Amazon EC2 cloud; Hadoop; Piccolo framework; PrIter distributed computing framework; Spark framework; data analysis application; iterative algorithm; iterative computation prioritization; iterative process; Cloud computing; Convergence; Couplings; Educational institutions; Iterative methods; Prediction algorithms; Vectors; MapReduce; PrIter; distributed framework; iterative algorithms; prioritized iteration;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2012.272
Filename :
6302132