DocumentCode
1926753
Title
Computing median values in a cloud environment using GridBatch and MapReduce
Author
Liu, Huan ; Orban, Dan
Author_Institution
Accenture Technol. Labs., San Jose, CA, USA
fYear
2009
fDate
Aug. 31 2009-Sept. 4 2009
Firstpage
1
Lastpage
10
Abstract
Traditional enterprise software is built around dedicated high-performance infrastructure, and it cannot be mapped directly onto an infrastructure cloud without a significant performance loss. Although MapReduce holds promise as a viable approach, it lacks building blocks that enable high-performance optimization, especially in a shared infrastructure. Following on our previous work, we introduce another building block, the block level operator (BLO), and show how it can be applied to a real enterprise application: finding the medians in a large data set. We propose two efficient approaches to computing medians, one using MapReduce and the other using the BLO. We compare the two approaches with each other and with the traditional enterprise software stack, and show that the BLO approach yields an order-of-magnitude improvement.
Keywords
commerce; grid computing; optimisation; GridBatch; MapReduce; block level operator; cloud environment; enterprise software; median values; optimization; Application software; Bandwidth; Cloud computing; Concurrent computing; Data analysis; Grid computing; Hardware; Large-scale systems; Network servers; Parallel programming
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster Computing and Workshops, 2009. CLUSTER '09. IEEE International Conference on
Conference_Location
New Orleans, LA
ISSN
1552-5244
Print_ISBN
978-1-4244-5011-4
Electronic_ISBN
1552-5244
Type
conf
DOI
10.1109/CLUSTR.2009.5289194
Filename
5289194