DocumentCode :
2335170
Title :
H-mine: hyper-structure mining of frequent patterns in large databases
Author :
Pei, Jian ; Han, Jiawei ; Lu, Hongjun ; Nishio, Shojiro ; Tang, Shiwei ; Yang, Dongqing
Author_Institution :
Peking Univ., Beijing, China
fYear :
2001
fDate :
2001
Firstpage :
441
Lastpage :
448
Abstract :
Methods for efficient mining of frequent patterns have been studied extensively by many researchers. However, the previously proposed methods still encounter some performance bottlenecks when mining databases with different data characteristics, such as dense vs. sparse, long vs. short patterns, memory-based vs. disk-based, etc. In this study, we propose a simple and novel hyper-linked data structure, H-struct, and a new mining algorithm, H-mine, which takes advantage of this data structure and dynamically adjusts links in the mining process. A distinct feature of this method is that it has very limited and precisely predictable space overhead and runs really fast in a memory-based setting. Moreover, it can be scaled up to very large databases by database partitioning, and when the data set becomes dense, (conditional) FP-trees can be constructed dynamically as part of the mining process. Our study shows that H-mine has high performance in various kinds of data, outperforms the previously developed algorithms in different settings, and is highly scalable in mining large databases. This study also proposes a new data mining methodology, space-preserving mining, which may have a strong impact on the future development of efficient and scalable data mining methods.
Keywords :
data mining; data structures; database theory; very large databases; FP-trees; H-mine; H-struct; dynamic link adjustment; frequent patterns; hyper-structure mining; hyperlinked data structure; large databases; memory-based setting; space overhead; space-preserving mining; Assembly; Clustering algorithms; Counting circuits; Data analysis; Data mining; Data structures; Frequency; Iterative algorithms; Partitioning algorithms; Spatial databases;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
0-7695-1119-8
Type :
conf
DOI :
10.1109/ICDM.2001.989550
Filename :
989550