Approximate frequent itemsets compression using dynamic clustering method

Author

Yan, Hua ; Sang, Yongsheng

Author_Institution

Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu

fYear

2008

fDate

21-24 Sept. 2008

Firstpage

1061

Lastpage

1066

Abstract

Frequent-itemset mining often generates a collection of frequent itemsets that is too large for users to examine and understand carefully. To reduce the output size, we propose a dynamic clustering method that compresses the frequent itemsets approximately. Concretely, two intra-cluster similarity measures for frequent itemsets, expression similarity and support similarity, are defined according to the specific requirements of frequent-itemset compression. Based on these two measures, a clustering criterion for frequent itemsets and a corresponding clustering algorithm are developed. Specifically, our method has two features: 1) users need not specify the number of frequent-itemset clusters explicitly; 2) the user's expected compression ratio is incorporated. Our initial experimental results show that the approximate frequent-itemset compression method is feasible and that the compression quality is good.
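
The abstract does not give the definitions of the two similarity measures or the clustering criterion, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a Jaccard overlap for expression similarity, a min/max support ratio for support similarity, and a simple greedy leader-style pass in which the number of clusters is not fixed in advance. All function names and both thresholds (`min_expr`, `min_supp`) are hypothetical, and the user's expected compression ratio is not modelled here.

```python
from typing import Dict, FrozenSet, List

Itemset = FrozenSet[str]


def expression_similarity(a: Itemset, b: Itemset) -> float:
    """Jaccard overlap of the items (assumed stand-in for the paper's measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def support_similarity(sa: float, sb: float) -> float:
    """Ratio of the smaller support to the larger (also an assumption)."""
    larger = max(sa, sb)
    return min(sa, sb) / larger if larger > 0 else 1.0


def cluster_itemsets(
    supports: Dict[Itemset, float], min_expr: float, min_supp: float
) -> List[List[Itemset]]:
    """Greedy leader clustering; clusters are created on demand."""
    clusters: List[List[Itemset]] = []
    # Visit longer itemsets first so each cluster is led by its longest pattern.
    for itemset in sorted(supports, key=lambda s: (-len(s), sorted(s))):
        for cluster in clusters:
            leader = cluster[0]
            if (
                expression_similarity(itemset, leader) >= min_expr
                and support_similarity(supports[itemset], supports[leader]) >= min_supp
            ):
                cluster.append(itemset)  # close enough to this cluster's leader
                break
        else:
            clusters.append([itemset])  # no cluster fits: start a new one
    return clusters


if __name__ == "__main__":
    # Toy supports, invented purely for illustration.
    toy = {
        frozenset("abc"): 0.48,
        frozenset("ab"): 0.50,
        frozenset("a"): 0.90,
        frozenset("de"): 0.30,
        frozenset("d"): 0.32,
    }
    for group in cluster_itemsets(toy, min_expr=0.5, min_supp=0.8):
        print([sorted(s) for s in group])
```

In this sketch each cluster's first member acts as its representative, so reporting only the leaders would give the compressed output; tightening or loosening the two thresholds trades compression against approximation error.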

Keywords

data compression; data mining; pattern clustering; dynamic clustering method; expression similarity; frequent itemsets compression; frequent-itemsets mining; intra-cluster similarities; support similarity; Algorithm design and analysis; Clustering algorithms; Clustering methods; Computational intelligence; Computer science; Data mining; Greedy algorithms; Itemsets; Laboratories; Transaction databases

fLanguage

English

Publisher

IEEE

Conference_Titel

Cybernetics and Intelligent Systems, 2008 IEEE Conference on

Conference_Location

Chengdu

Print_ISBN

978-1-4244-1673-8

Electronic_ISBN

978-1-4244-1674-5

Type

conf

DOI

10.1109/ICCIS.2008.4670945

Filename

4670945