DocumentCode :
3724245
Title :
A MapReduce Optimization Method on Hadoop Cluster
Author :
Xiaodong Wu
Author_Institution :
Fujian Provincial Key Lab. of Data Intensive Comput. Key Lab. of Intell. Comput. &
Year :
2015
Firstpage :
18
Lastpage :
21
Abstract :
The MapReduce parallel and distributed computing framework has been widely applied in both academia and industry. A MapReduce application runs in two steps: Map and Reduce. The input data is divided into splits that can be processed concurrently, and the number of splits determines the number of map tasks. In this paper, we present a regression-based method for computing the number of Map and Reduce tasks so that the performance of a MapReduce application can be improved. Regression analysis is used to predict the execution time of MapReduce applications. Experimental results show that the proposed optimization method can effectively reduce application execution time.
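The two ideas in the abstract (splits determine the map task count, and a regression predicts execution time) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's actual method: the function names are hypothetical, the model form t = a + b/r + c*r is a guess at a plausible shape (work divided across r tasks plus per-task overhead), the split count ignores Hadoop's handling of a small final split, and the sample timings are synthetic.

```python
import math


def num_map_tasks(input_bytes, split_bytes):
    """Approximate map task count: one task per input split (single input file assumed)."""
    return math.ceil(input_bytes / split_bytes) if input_bytes > 0 else 0


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def features(r):
    """Assumed (hypothetical) model terms: constant, work split r ways, per-task overhead."""
    return [1.0, 1.0 / r, float(r)]


def fit(samples):
    """Least-squares fit of t = a + b/r + c*r via the normal equations."""
    k = 3
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for r, t in samples:
        f = features(r)
        for i in range(k):
            xty[i] += f[i] * t
            for j in range(k):
                xtx[i][j] += f[i] * f[j]
    return solve(xtx, xty)


def predict(coef, r):
    """Predicted execution time for r tasks under the fitted model."""
    return sum(c * f for c, f in zip(coef, features(r)))


def best_task_count(coef, candidates):
    """Candidate task count with the lowest predicted execution time."""
    return min(candidates, key=lambda r: predict(coef, r))


# Synthetic (task count, seconds) pairs generated from t = 10 + 400/r + r.
# These are NOT measurements from the paper.
samples = [(1, 411.0), (5, 95.0), (10, 60.0), (20, 50.0), (40, 60.0)]
coef = fit(samples)
print(num_map_tasks(1024**3, 128 * 1024**2), best_task_count(coef, range(1, 41)))
```

In practice the samples would come from timed runs of the real job at several task counts, and the candidate range would be bounded by the cluster's available task slots.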
Keywords :
"Optimization methods","Hardware","Distributed databases","Programming","Data models","Parallel processing"
Publisher :
IEEE
Conference_Title :
Industrial Informatics - Computing Technology, Intelligent Technology, Industrial Information Integration (ICIICII), 2015 International Conference on
Type :
conf
DOI :
10.1109/ICIICII.2015.92
Filename :
7373780