DocumentCode
163259
Title
Improving performance of small-file accessing in Hadoop
Author
Vorapongkitipun, Chatuporn ; Nupairoj, Natawut
Author_Institution
Dept. of Comput. Eng., Chulalongkorn Univ., Bangkok, Thailand
fYear
2014
fDate
14-16 May 2014
Firstpage
200
Lastpage
205
Abstract
The Hadoop Distributed File System (HDFS) is an open source system designed to run on commodity hardware and suited to applications with large data sets (terabytes). Because the HDFS architecture relies on a single master (NameNode) to manage metadata for multiple slaves (DataNodes), the NameNode often becomes a bottleneck, especially when handling a large number of small files. To maximize efficiency, the NameNode stores the entire metadata of HDFS in its main memory. With too many small files, the NameNode can run out of memory. In this paper, we propose a mechanism based on Hadoop Archive (HAR), called New Hadoop Archive (NHAR), to improve memory utilization for metadata and enhance the efficiency of accessing small files in HDFS. In addition, we extend HAR capabilities to allow additional files to be inserted into existing archive files. Our experimental results show that our approach can drastically improve the access efficiency of small files, outperforming HAR by up to 85.47%.
Keywords
distributed processing; file organisation; meta data; public domain software; Datanode; HAR capabilities; HDFS; HDFS architecture; Hadoop distributed file system; NHAR; NameNode; commodity hardware; large data sets; memory utilization; metadata management for multiple slaves; new Hadoop archive; open source system; small file access efficiency; small-file accessing performance improvement; HAR; HDFS; Hadoop; Hadoop Archive; Improve performance; Small files in Hadoop;
fLanguage
English
Publisher
ieee
Conference_Titel
Computer Science and Software Engineering (JCSSE), 2014 11th International Joint Conference on
Conference_Location
Chon Buri
Print_ISBN
978-1-4799-5821-4
Type
conf
DOI
10.1109/JCSSE.2014.6841867
Filename
6841867