Title :
Using Frequent Closed Itemsets for Data Dimensionality Reduction
Author :
Krajca, Petr ; Outrata, Jan ; Vychodil, Vilem
Author_Institution :
Dept. Comput. Sci., Palacky Univ., Olomouc, Czech Republic
Abstract :
We address dimensionality reduction of transactional data sets, where the input data consists of a list of transactions, each of them a finite set of items. The reduction consists of finding a set of new items, so-called factor-items, that is considerably smaller than the original set of items yet carries full or nearly full information about the original items. With this type of reduction, the original data set can be represented by a smaller transactional data set that uses factor-items instead of the original items, thus reducing its dimensionality. The procedure used in this paper is based on approximate Boolean matrix decomposition. We focus on the role of frequent closed itemsets, which can be used to determine factor-items. We present the factorization problem, its reduction to Boolean matrix decomposition, an algorithm for computing decompositions, and experiments with publicly available data sets.
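The idea in the abstract can be illustrated with a small sketch. It is not the algorithm from the paper: it assumes a naive enumeration of closed itemsets (all intersections of transactions) and a simple greedy set-cover selection, and every name in it (`closed_itemsets`, `greedy_factors`, `reduce_data`, `reconstruct`, `min_support`, `coverage`) is hypothetical. Each chosen closed itemset becomes one factor-item, and a transaction receives that factor-item when it contains the whole itemset.

```python
def closed_itemsets(transactions, min_support=1):
    """Naively enumerate closed itemsets as intersections of transactions."""
    closed = set()
    for t in transactions:
        # All intersections of non-empty groups of the transactions seen so far.
        closed |= {t} | {t & c for c in closed}
    result = []
    for items in closed:
        rows = frozenset(i for i, t in enumerate(transactions) if items <= t)
        if items and len(rows) >= min_support:
            result.append((items, rows))
    # Sort so that ties in the greedy step are broken deterministically.
    return sorted(result, key=lambda cand: sorted(cand[0]))


def greedy_factors(transactions, min_support=1, coverage=1.0):
    """Greedily pick closed itemsets until the requested share of
    (transaction, item) pairs is covered. Assumed heuristic, for illustration."""
    transactions = [frozenset(t) for t in transactions]
    candidates = closed_itemsets(transactions, min_support)
    uncovered = {(i, x) for i, t in enumerate(transactions) for x in t}
    total = len(uncovered)
    factors = []

    def gain(cand):
        items, rows = cand
        return sum((i, x) in uncovered for i in rows for x in items)

    while total and (total - len(uncovered)) / total < coverage:
        best = max(candidates, key=gain, default=None)
        if best is None or gain(best) == 0:
            break  # nothing left that the frequent closed itemsets can cover
        factors.append(best)
        items, rows = best
        uncovered -= {(i, x) for i in rows for x in items}
    return factors


def reduce_data(transactions, factors):
    """Rewrite each transaction as a set of factor-item indices."""
    return [{k for k, (_, rows) in enumerate(factors) if i in rows}
            for i in range(len(transactions))]


def reconstruct(reduced, factors):
    """Boolean product: union of the itemsets of the assigned factor-items."""
    return [frozenset().union(*(factors[k][0] for k in row)) for row in reduced]


data = [{"a", "b"}, {"a", "b"}, {"a", "b", "c", "d"}, {"c", "d"}]
factors = greedy_factors(data)
reduced = reduce_data(data, factors)
restored = reconstruct(reduced, factors)
```

In this toy data set, four original items are replaced by two factor-items, and the Boolean product of the reduced data with the factor definitions restores the original transactions. Raising `min_support` or lowering `coverage` in the sketch trades exactness for fewer candidates or fewer factors, which corresponds to the "nearly full information" case.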
Keywords :
Boolean functions; matrix decomposition; Boolean matrix decomposition; data dimensionality reduction; factorization problem; frequent closed itemsets; transactional data sets; Approximation algorithms; Approximation methods; Arrays; Data mining; Itemsets; Boolean matrices; dimensionality reduction; set covering
Conference_Title :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4577-2075-8
DOI :
10.1109/ICDM.2011.154