Title :
Compressed Hierarchical Mining of Frequent Closed Patterns from Dense Data Sets
Author :
Ji, Liping ; Tan, Kian-Lee ; Tung, Anthony K H
Author_Institution :
Nat. Univ. of Singapore, Singapore
Abstract :
This paper addresses the problem of finding frequent closed patterns (FCPs) from very dense data sets. We introduce two compressed hierarchical FCP mining algorithms: C-Miner and B-Miner. Both algorithms compress the original mining space, hierarchically partition the whole mining task into independent subtasks, and mine each subtask progressively. They differ in their task partitioning strategies: C-Miner partitions the mining task based on Compact Matrix Division, whereas B-Miner partitions the task based on Base Rows Projection. The compressed hierarchical mining algorithms improve mining efficiency and facilitate progressive refinement of results. Moreover, because the subtasks can be mined independently, C-Miner and B-Miner can be readily parallelized without incurring significant communication overhead. We have implemented C-Miner and B-Miner, and our performance study on synthetic data sets and real dense microarray data sets demonstrates their effectiveness relative to existing schemes. We also report experimental results on parallel versions of these two methods.
Keywords :
data mining; pattern recognition; B-Miner; Base Rows Projection; C-Miner; Compact Matrix Division; compressed hierarchical mining; dense data sets; dense microarray data sets; frequent closed patterns; synthetic data sets; task partitioning; Association rules; Computer Society; Data analysis; Gene expression; Partitioning algorithms; Pattern analysis; parallel mining; progressive
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2007.1047
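
Illustration :
To make the abstract's central term concrete, the following is a minimal brute-force sketch of what a frequent closed pattern is. It is not C-Miner or B-Miner, and it performs no compression or task partitioning; the function name, the toy rows, and the min_support parameter are assumptions introduced here for illustration only. A pattern is treated as an item set, its support as the number of rows containing it, and it counts as closed when no proper superset has the same support, which is equivalent to the pattern being the intersection of all rows that contain it.

```python
from itertools import combinations


def frequent_closed_patterns(rows, min_support):
    """Return {closed item set: support} by exhaustive enumeration.

    Hypothetical helper for illustration. It intersects every non-empty
    subset of rows, so it is exponential in the number of rows and only
    suitable for tiny examples.
    """
    rows = [frozenset(r) for r in rows]
    result = {}
    for k in range(1, len(rows) + 1):
        for subset in combinations(rows, k):
            # The intersection of any group of rows is a closed item set.
            items = frozenset.intersection(*subset)
            if not items:
                continue
            # Support counts every row that contains the item set.
            support = sum(1 for r in rows if items <= r)
            if support >= min_support:
                result[items] = support
    return result


# Toy data (assumed): four rows over the items a, b, c, d.
rows = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "b"},
    {"c", "d"},
]
patterns = frequent_closed_patterns(rows, min_support=2)
for items, support in sorted(patterns.items(), key=lambda p: sorted(p[0])):
    print(sorted(items), support)
```

In this toy data the item set {a} occurs in three rows, but so does its superset {a, b}, so {a} is not closed and only {a, b} is reported. The exponential cost of this enumeration is the kind of blow-up that motivates dedicated mining algorithms for dense data.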