DocumentCode :
2205372
Title :
SFMapReduce: An optimized MapReduce framework for Small Files
Author :
Fang Zhou ; Hai Pham ; Jianhui Yue ; Hao Zou ; Weikuan Yu
Author_Institution :
Auburn University, AL, 36849, USA
fYear :
2015
fDate :
6-7 Aug. 2015
Firstpage :
23
Lastpage :
32
Abstract :
Hadoop, an open-source implementation of MapReduce, is widely used because of its ease of programming, scalability, and availability. With the explosive development of cloud computing, business and scientific applications increasingly take advantage of Hadoop. The files stored and processed in Hadoop are no longer limited to very large files. However, Hadoop cannot provide stable and efficient services for small files at either the storage or the processing level. To solve these problems, we propose an optimized MapReduce framework for small files, SFMapReduce. In SFMapReduce, we present two techniques, Small File Layout (SFLayout) and customized MapReduce (CMR). SFLayout is used to solve the memory problem and improve I/O performance in HDFS. CMR provides an interface for MapReduce so that SFMapReduce can run MapReduce jobs on SFLayout efficiently. Our experimental results show that SFMapReduce decreases the memory pressure on the Hadoop NameNode and provides better loading and retrieving throughput. On average, SFMapReduce speeds up MapReduce processing by 14.5 times and 20.8 times compared with the original Hadoop and the HAR layout, respectively.
Keywords :
Containers; Heart beat; Indexes; Layout; Loading; Metadata; Throughput
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Networking, Architecture and Storage (NAS), 2015 IEEE International Conference on
Conference_Location :
Boston, MA, USA
Type :
conf
DOI :
10.1109/NAS.2015.7255218
Filename :
7255218