• DocumentCode
    1638832
  • Title

    GridBatch: Cloud Computing for Large-Scale Data-Intensive Batch Applications

  • Author

    Liu, Huan ; Orban, Dan

  • Author_Institution
    Accenture Technol. Labs., Bangalore
  • fYear
    2008
  • Firstpage
    295
  • Lastpage
    305
  • Abstract
    To be competitive, enterprises are collecting and analyzing increasingly large amounts of data in order to derive business insights. However, there are at least two challenges to meeting the increasing demand. First, the growth in the amount of data far outpaces the computation power growth of a uniprocessor. The growing gap between the supply and demand of computation power forces enterprises to parallelize their application code. Unfortunately, parallel programming is both time-consuming and error-prone. Second, the emerging Cloud Computing paradigm imposes constraints on the underlying infrastructure, which forces enterprises to rethink their application architecture. We propose the GridBatch system, which aims at solving large-scale data-intensive batch problems under the Cloud infrastructure constraints. GridBatch is a programming model and associated library that hides the complexity of parallel programming, yet it gives users complete control over how data are partitioned and how computation is distributed so that applications can have the highest performance possible. Through a real client example, we show that GridBatch achieves high performance in Amazon's EC2 computing Cloud.
  • Keywords
    batch processing (computers); grid computing; parallel programming; Cloud Computing; EC2 computing Cloud; GridBatch system; large-scale data-intensive batch applications; large-scale data-intensive batch problems; parallel programming; Cloud computing; Computer architecture; Concurrent computing; Distributed computing; Grid computing; High performance computing; Large-scale systems; Libraries; Parallel programming; Supply and demand; Amazon; Cloud Computing; EC2; GridBatch; MapReduce; S3;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing and the Grid, 2008. CCGRID '08. 8th IEEE International Symposium on
  • Conference_Location
    Lyon
  • Print_ISBN
    978-0-7695-3156-4
  • Electronic_ISBN
    978-0-7695-3156-4
  • Type

    conf

  • DOI
    10.1109/CCGRID.2008.30
  • Filename
    4534231