  • DocumentCode
    1926753
  • Title
    Computing median values in a cloud environment using GridBatch and MapReduce
  • Author
    Liu, Huan; Orban, Dan
  • Author_Institution
    Accenture Technol. Labs., San Jose, CA, USA
  • fYear
    2009
  • fDate
    Aug. 31 2009-Sept. 4 2009
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Traditional enterprise software is built around dedicated high-performance infrastructure and cannot be mapped directly onto an infrastructure cloud without significant performance loss. Although MapReduce holds promise as a viable approach, it lacks building blocks that enable high-performance optimization, especially in a shared infrastructure. Following on our previous work, we introduce another building block, the block level operator (BLO), and show how it can be applied to a real enterprise application: finding the medians in a large data set. We propose two efficient approaches to computing medians, one using MapReduce and the other using the BLO. We compare the two approaches with each other and with the traditional enterprise software stack, and show that the BLO approach gives an order-of-magnitude improvement.
  • Keywords
    commerce; grid computing; optimisation; GridBatch; MapReduce; block level operator; cloud environment; enterprise software; median values; optimization; Application software; Bandwidth; Cloud computing; Concurrent computing; Data analysis; Grid computing; Hardware; Large-scale systems; Network servers; Parallel programming
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing and Workshops, 2009. CLUSTER '09. IEEE International Conference on
  • Conference_Location
    New Orleans, LA
  • ISSN
    1552-5244
  • Print_ISBN
    978-1-4244-5011-4
  • Electronic_ISBN
    1552-5244
  • Type
    conf
  • DOI
    10.1109/CLUSTR.2009.5289194
  • Filename
    5289194