Title :
Evaluating data storage structures of MapReduce
Author :
Haiming Lai ; Ming Xu ; Jian Xu ; Yizhi Ren ; Ning Zheng
Author_Institution :
Coll. of Comput., Hangzhou Dianzi Univ., Hangzhou, China
Abstract :
The MapReduce framework and its open-source implementation Hadoop, a scalable and fault-tolerant infrastructure for big data analysis on large clusters, can perform differently depending on the data storage structure used. This paper evaluates the performance of three data storage structures for MapReduce: row-store, column-store, and RCFile. The experiments compare the three structures in terms of data loading time, data storage space, and query execution time. The experimental results show that the RCFile data storage structure achieves better performance in most cases.
Keywords :
data analysis; fault tolerant computing; public domain software; storage management; MapReduce framework; RCFile; big data analysis; column-store; data loading time; data storage space; data storage structure evaluation; fault-tolerant infrastructure; large clusters; open-source implementation Hadoop; query execution time; row-store; Open source software; Weaving; MapReduce; RCFile; column-store; data storage structure; row-store
Conference_Title :
Computer Science & Education (ICCSE), 2013 8th International Conference on
Conference_Location :
Colombo
Print_ISBN :
978-1-4673-4464-7
DOI :
10.1109/ICCSE.2013.6554067