• DocumentCode
    42585
  • Title

    PrIter: A Distributed Framework for Prioritizing Iterative Computations

  • Author

    Yanfeng Zhang ; Qixin Gao ; Lixin Gao ; Cuirong Wang

  • Author_Institution
    Comput. Center, Northeastern Univ., Shenyang, China
  • Volume
    24
  • Issue
    9
  • fYear
    2013
  • fDate
    Sept. 2013
  • Firstpage
    1884
  • Lastpage
    1893
  • Abstract
    Iterative computations are pervasive among data analysis applications, including web search, online social network analysis, recommendation systems, and so on. These applications typically involve data sets of massive scale. Fast convergence of the iterative computations on these massive data sets is essential for these applications. In this paper, we explore the opportunity for accelerating iterative computations by prioritization. Instead of performing computations on all data points without discrimination, we prioritize the computations that help convergence the most, so that the convergence speed of the iterative process is significantly improved. We develop a distributed computing framework, PrIter, which supports the prioritized execution of iterative computations. PrIter either stores intermediate data in memory for fast convergence or stores intermediate data in files for scaling to larger data sets. We evaluate PrIter on a local cluster of machines as well as on the Amazon EC2 cloud. The results show that PrIter achieves up to a 50× speedup over Hadoop for a series of iterative algorithms. In addition, PrIter is shown to deliver better performance for iterative computations than other state-of-the-art distributed frameworks such as Spark and Piccolo.
  • Keywords
    cloud computing; data analysis; distributed processing; iterative methods; Amazon EC2 cloud; Hadoop; Piccolo framework; PrIter distributed computing framework; Spark framework; data analysis application; iterative algorithm; iterative computation prioritization; iterative process; Cloud computing; Convergence; Couplings; Educational institutions; Iterative methods; Prediction algorithms; Vectors; MapReduce; PrIter; distributed framework; iterative algorithms; prioritized iteration
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2012.272
  • Filename
    6302132