Regional Information Center for Science and Technology - CHAC: An Effective Attribute Clustering Algorithm for Large-Scale Data Processing

DocumentCode :

3437021

Title :

CHAC: An Effective Attribute Clustering Algorithm for Large-Scale Data Processing

Author :

Gu, Xiaoyan ; Yang, Xiufeng ; Wang, Weiping ; Jin, Yan ; Meng, Dan

Author_Institution :

Inst. of Comput. Technol., Beijing, China

fYear :

2012

fDate :

28-30 June 2012

Firstpage :

Lastpage :

Abstract :

Hadoop has become a leading architecture for large-scale data processing. One efficient way to accelerate data processing is the column-oriented storage technique, which has recently been integrated into the Hadoop family. However, designing an attribute clustering algorithm that achieves optimal data processing performance on a column-oriented Hadoop system remains a significant problem. In this paper, we propose a novel algorithm called CHAC to solve this problem. CHAC considers both overlapping and non-overlapping attribute clusters. In addition, an adjustable parameter limits space overhead to prevent excessive attribute redundancy. Experimental results on the TPC-H benchmark demonstrate the efficiency and effectiveness of the proposed algorithm.

Keywords :

data handling; pattern clustering; query processing; storage management; CHAC; TPC-H Benchmark; attribute clustering algorithm; attribute redundancy; column-oriented Hadoop system; column-oriented storage technique; data processing acceleration; large-scale data processing; limiting space overhead; nonoverlapping attribute cluster; optimal data processing performance; query execution time; Algorithm design and analysis; Clustering algorithms; Data models; Database systems; Itemsets; Partitioning algorithms; Attribute Clustering; CHAC; Hadoop; Overlapping;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Networking, Architecture and Storage (NAS), 2012 IEEE 7th International Conference on

Conference_Location :

Xiamen, Fujian

Print_ISBN :

978-1-4673-1889-1

Type :

conf

DOI :

10.1109/NAS.2012.16

Filename :

6310881

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3437021