DocumentCode :
472416
Title :
Mining High Utility Itemsets in Large High Dimensional Data
Author :
Yu, Guangzhu ; Li, Keqing ; Shao, Shihuang
Author_Institution :
Donghua Univ., Shanghai
fYear :
2008
fDate :
23-24 Jan. 2008
Firstpage :
17
Lastpage :
20
Abstract :
Existing algorithms for utility mining are inadequate on datasets with high dimensionality or long patterns. This paper proposes a hybrid method that combines a row enumeration algorithm (inter-transaction) with a column enumeration algorithm (two-phase) to discover high utility itemsets from two directions: two-phase seeks short high utility itemsets from the bottom, while inter-transaction seeks long high utility itemsets from the top. In addition, an optimization technique is adopted to speed up the computation of transaction intersections. Experiments on synthetic data show that the hybrid method achieves high performance on large, high-dimensional datasets.
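The quantities the abstract refers to can be illustrated with a minimal sketch. It is not the paper's implementation: the toy transactions, the unit profits, and the function names are all hypothetical, and the transaction-weighted utility (TWU) bound is assumed here as the pruning measure usually associated with two-phase style mining. The `intersect` helper only shows the basic operation that row enumeration builds on, not the paper's optimization technique.

```python
# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 4},
]
# Hypothetical per-unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def transaction_utility(t):
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(q * unit_profit[i] for i, q in t.items())


def itemset_utility(itemset):
    """Utility of `itemset`, summed over every transaction that contains it."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * unit_profit[i] for i in itemset)
    return total


def twu(itemset):
    """Transaction-weighted utility: an upper bound on itemset_utility."""
    return sum(
        transaction_utility(t)
        for t in transactions
        if all(i in t for i in itemset)
    )


def intersect(t1, t2):
    """Items shared by two transactions (the row enumeration building block)."""
    return frozenset(t1) & frozenset(t2)


print(itemset_utility({"a", "c"}))  # exact utility of {a, c}
print(twu({"a", "c"}))              # upper bound, never below the exact utility
print(sorted(intersect(transactions[0], transactions[1])))
```

Because the TWU of an itemset never falls below its exact utility, a bottom-up search can discard any candidate whose TWU is under the threshold, while intersecting transactions directly yields long candidate itemsets without enumerating their subsets.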
Keywords :
data mining; optimisation; column enumeration algorithm; high dimensional data; optimization technique; row enumeration algorithm; utility itemset mining; Association rules; Data mining; Databases; Educational institutions; Itemsets; Partitioning algorithms; Terminology;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Knowledge Discovery and Data Mining, 2008. WKDD 2008. First International Workshop on
Conference_Location :
Adelaide, SA
Print_ISBN :
978-0-7695-3090-1
Type :
conf
DOI :
10.1109/WKDD.2008.64
Filename :
4470341