• DocumentCode
    2194581
  • Title
    DELMA: Dynamically ELastic MapReduce Framework for CPU-Intensive Applications
  • Author
    Fadika, Zacharia ; Govindaraju, Madhusudhan
  • Author_Institution
    Dept. of Comput. Sci., State Univ. of New York (SUNY) at Binghamton, Binghamton, NY, USA
  • fYear
    2011
  • fDate
    23-26 May 2011
  • Firstpage
    454
  • Lastpage
    463
  • Abstract
    Since the introduction of MapReduce, its implementations have primarily targeted static compute cluster sizes. In this paper, we introduce the concept of dynamic elasticity to MapReduce. We present the design decisions and implementation tradeoffs for DELMA (Dynamically Elastic MapReduce), a framework that follows the MapReduce paradigm, just like Hadoop MapReduce, but that is capable of growing and shrinking its cluster size while jobs are underway. In our study, we test DELMA in diverse performance scenarios, ranging from diverse node additions to node additions at various points in the application run-time with various dataset sizes. The applicability of the MapReduce paradigm extends far beyond large-scale data-intensive applications; it can also be brought to bear on long-running distributed applications executing on small clusters. In this work, we focus on both the performance of processing hierarchical data in distributed scientific applications and the processing of smaller but demanding inputs primarily used in small clusters. We run experiments on datasets that require CPU-intensive processing, ranging in size from millions of input data elements to over half a billion elements, and observe the positive scalability patterns exhibited by the system. We show that, for such sizes, performance scales with increases in data and cluster size. We conclude with the benefits of giving MapReduce the capability to dynamically grow and shrink its cluster configuration by adding and removing nodes during jobs, and explain the possibilities presented by this model.
  • Keywords
    cloud computing; functional programming; multiprocessing systems; CPU; DELMA; Hadoop MapReduce; data intensive applications; distributed processing; dynamically elastic MapReduce; intensive processing; scalability patterns; Computational modeling; Distance measurement; Fault tolerance; Fault tolerant systems; Peer to peer computing; Transient analysis; Hadoop; MapReduce
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM International Symposium on
  • Conference_Location
    Newport Beach, CA
  • Print_ISBN
    978-1-4577-0129-0
  • Electronic_ISBN
    978-0-7695-4395-6
  • Type
    conf
  • DOI
    10.1109/CCGrid.2011.71
  • Filename
    5948636