DocumentCode :
1857890
Title :
Location-Aware MapReduce in Virtual Cloud
Author :
Geng, Yifeng ; Chen, Shimin ; Wu, Yongwei ; Wu, Ryan ; Yang, Guangwen ; Zheng, Weimin
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
fYear :
2011
fDate :
13-16 Sept. 2011
Firstpage :
275
Lastpage :
284
Abstract :
MapReduce is an important programming model for processing and generating large data sets in parallel. It is commonly applied in areas such as web indexing, data mining, and machine learning. Hadoop, an open-source implementation of MapReduce, is now widely used in industry. Virtualization, which is easy to configure and economical to use, shows great potential for cloud computing. With the increasing number of cores per CPU and the adoption of virtualization techniques, one physical machine can host more and more virtual machines, but I/O devices normally do not increase as rapidly. Because MapReduce systems are often used to run I/O-intensive applications, reduced data redundancy and load imbalance, both of which increase I/O interference in a virtual cloud, become serious problems. This paper builds a model and defines metrics to analyze the data allocation problem in virtual environments theoretically. We also design a location-aware file block allocation strategy that retains compatibility with native Hadoop. Our model simulation and experiments on a real system show that the new strategy achieves better data redundancy and load balance, reducing I/O interference. Execution times of applications such as RandomWriter, Text Sort, and Word Count are reduced by up to 33%, and by 10% on average.
Keywords :
cloud computing; mobile computing; parallel processing; resource allocation; virtual machines; virtualisation; Hadoop; RandomWriter application; TextSort application; WordCount application; cloud computing; data allocation problem; location-aware MapReduce; parallel processing; virtual cloud; virtual machines; virtualization technique; Data models; Interference; Measurement; Resource management; Throughput; Virtual environments; Virtual machining; Data allocation; I/O interference; Load balance; MapReduce; Virtualization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Processing (ICPP), 2011 International Conference on
Conference_Location :
Taipei City
ISSN :
0190-3918
Print_ISBN :
978-1-4577-1336-1
Electronic_ISBN :
0190-3918
Type :
conf
DOI :
10.1109/ICPP.2011.40
Filename :
6047196