Title :
A scheme of structured data compression and query on Hadoop platform
Author :
Xiangwu Ding ; Bo Tian ; Yefeng Li
Author_Institution :
Dept. of Comput. Sci. & Technol., Univ. of Donghua, Shanghai, China
Abstract :
We propose a data compression and query scheme that improves the performance of processing structured data on the Hadoop platform. First, we design a data page structure for row-column hybrid storage on HDFS. We then propose and implement an adaptive lightweight data compression strategy, based on MapReduce, that compresses data and stores it in the proposed storage structure. Finally, we provide a query strategy that executes directly on the compressed data in that storage structure. Experiments on large-scale datasets demonstrate the effectiveness of the proposed strategy in reducing storage volume and improving query performance for structured data.
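The abstract does not give the page layout or the compression algorithms, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "lightweight" compression means schemes such as run-length and dictionary encoding, that "adaptive" means picking a scheme per column from simple statistics, and that a page holds a group of rows stored column by column. All function names and the selection thresholds are hypothetical, and the MapReduce and HDFS layers are omitted.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def dict_encode(values):
    """Dictionary-encode a column: each distinct value gets a small integer code."""
    dictionary = {}
    codes = []
    for value in values:
        # Assign the next free code the first time a value is seen.
        codes.append(dictionary.setdefault(value, len(dictionary)))
    return dictionary, codes


def choose_encoding(values):
    """Hypothetical selection rule (an assumption, not taken from the paper)."""
    n = len(values)
    if n == 0:
        return "plain"
    runs = sum(1 for _ in groupby(values))
    if runs <= n // 2:           # long runs: run-length encoding pays off
        return "rle"
    if len(set(values)) <= n // 2:  # few distinct values: use a dictionary
        return "dict"
    return "plain"


def compress_page(rows):
    """Turn one row group into a page of independently encoded column chunks."""
    page = []
    for column in zip(*rows):
        column = list(column)
        kind = choose_encoding(column)
        if kind == "rle":
            payload = rle_encode(column)
        elif kind == "dict":
            payload = dict_encode(column)
        else:
            payload = column
        page.append((kind, payload))
    return page


def count_equal(chunk, target):
    """Count rows equal to target without decompressing the column chunk."""
    kind, payload = chunk
    if kind == "rle":
        return sum(length for value, length in payload if value == target)
    if kind == "dict":
        dictionary, codes = payload
        code = dictionary.get(target)
        return 0 if code is None else codes.count(code)
    return payload.count(target)


rows = [
    ("cn", 1, "a"), ("cn", 2, "b"), ("cn", 3, "a"),
    ("us", 4, "b"), ("us", 5, "a"), ("us", 6, "b"),
]
page = compress_page(rows)
print([kind for kind, _ in page])
print(count_equal(page[0], "cn"))
```

In this sketch an equality predicate is evaluated against run lengths or dictionary codes rather than against decoded values, which is one plausible reading of executing a query directly on compressed data.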
Keywords :
data handling; data structures; HDFS; Hadoop platform; MapReduce; adaptive lightweight data compression strategy; data page structure; query performance; query strategy; query technology; row-column hybrid storage; storage structure; structured data compression; Big data; Compression algorithms; Data analysis; Data compression; Dictionaries; Encoding; Query processing;
Conference_Title :
Digital Information, Networking, and Wireless Communications (DINWC), 2015 Third International Conference on
Conference_Location :
Moscow
Print_ISBN :
978-1-4799-6375-1
DOI :
10.1109/DINWC.2015.7054235