DocumentCode :
2991559
Title :
OLAP Aggregation Based on Dimension-oriented Storage
Author :
Jing-hua, Zhao ; Ai-mei, Song ; Ai-bo, Song
Author_Institution :
Coll. of Inf. Sci. & Eng., Shandong Univ. of Sci. & Technol., Qingdao, China
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
1932
Lastpage :
1936
Abstract :
OLAP (online analytical processing) applications rely on a variety of aggregate queries over large-scale data. Because aggregation is always performed on columns, traditional row-oriented storage, in which all the columns of a data row are stored together, seriously restricts aggregation performance. This paper proposes a dimension-oriented storage model based on HBase and a new parallel aggregation technique that carries out aggregation operations with parallel MapReduce jobs. In a comparison with Hive on the standard TPC-H data set, the technique significantly improves the performance of core aggregate operations.
Keywords :
data mining; parallel processing; query processing; storage management; HBase; Hive; OLAP aggregation; TPC-H data set; aggregate queries; aggregation operations; dimension-oriented storage model; large-scale data; online analytical processing applications; parallel MapReduce jobs; parallel aggregation technique; row-oriented storage; MapReduce; OLAP (online analytical processing); aggregation; dimension-oriented storage;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
Type :
conf
DOI :
10.1109/IPDPSW.2012.241
Filename :
6270398