• DocumentCode
    2484178
  • Title

    CellMR: A framework for supporting MapReduce on asymmetric Cell-based clusters

  • Author

    Rafique, M. Mustafa ; Rose, Benjamin ; Butt, Ali R. ; Nikolopoulos, Dimitrios S.

  • Author_Institution
    Dept. of Comput. Sci., Virginia Tech, Blacksburg, VA, USA
  • fYear
    2009
  • fDate
    23-29 May 2009
  • Firstpage
    1
  • Lastpage
    12
  • Abstract
    The use of asymmetric multi-core processors with on-chip computational accelerators is becoming common in a variety of environments ranging from scientific computing to enterprise applications. The focus of current research has been on making efficient use of individual systems and on porting applications to asymmetric processors. In this paper, we take the next step by investigating the use of multi-core-based systems, especially the popular Cell processor, in a cluster setting. We present CellMR, an efficient and scalable implementation of the MapReduce framework for asymmetric Cell-based clusters. The novelty of CellMR lies in its adoption of a streaming approach to supporting MapReduce and in its adaptive resource scheduling schemes: instead of allocating workloads to the components once, CellMR slices the input into small work units and streams them to the asymmetric nodes for efficient processing. Moreover, CellMR removes I/O bottlenecks by design, using a number of techniques, such as double-buffering and asynchronous I/O, to maximize cluster performance. Our evaluation of CellMR using typical MapReduce applications shows that it achieves 50.5% better performance compared to the standard nonstreaming approach, introduces a very small overhead on the manager irrespective of application input size, scales almost linearly with an increasing number of compute nodes (a speedup of 6.9 on average when using eight nodes compared to a single node), and effectively adapts the parameters of its resource management policy between applications with varying computation density.
  • Keywords
    microprocessor chips; multiprocessing systems; resource allocation; scheduling; workstation clusters; Cell processor; CellMR; MapReduce; adaptive resource scheduling; asymmetric Cell-based cluster; asymmetric multicore processor; asynchronous I/O; double buffering; enterprise application; environment ranging; on-chip computational accelerator; scientific computing; Acceleration; Computer architecture; Data processing; Distributed computing; High performance computing; Multicore processing; Parallel architectures; Parallel programming; Processor scheduling; Resource management
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
  • Conference_Location
    Rome
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4244-3751-1
  • Electronic_ISBN
    1530-2075
  • Type

    conf

  • DOI
    10.1109/IPDPS.2009.5161062
  • Filename
    5161062