Title :
Computing median values in a cloud environment using GridBatch and MapReduce
Author :
Liu, Huan ; Orban, Dan
Author_Institution :
Accenture Technol. Labs., San Jose, CA, USA
Date :
Aug. 31 2009-Sept. 4 2009
Abstract :
Traditional enterprise software is built around a dedicated high-performance infrastructure, and it cannot be mapped directly to an infrastructure cloud without a significant performance loss. Although MapReduce holds promise as a viable approach, it lacks building blocks that enable high-performance optimization, especially in a shared infrastructure. Following on our previous work, we introduce another building block called the block level operator (BLO), and we show how it can be applied to a real enterprise application: finding the medians in a large data set. We propose two efficient approaches to compute medians, one using MapReduce and the other using the BLO. We compare the two approaches with each other and with the traditional enterprise software stack, and show that our approach using the BLO gives an order-of-magnitude improvement.
Keywords :
commerce; grid computing; optimisation; GridBatch; MapReduce; block level operator; cloud environment; enterprise software; median values; optimization; Application software; Bandwidth; Cloud computing; Concurrent computing; Data analysis; Grid computing; Hardware; Large-scale systems; Network servers; Parallel programming
Conference_Title :
Cluster Computing and Workshops, 2009. CLUSTER '09. IEEE International Conference on
Conference_Location :
New Orleans, LA
Print_ISBN :
978-1-4244-5011-4
Electronic_ISBN :
1552-5244
DOI :
10.1109/CLUSTR.2009.5289194