DocumentCode :
3576335
Title :
Itemset approximation using Constrained Binary Matrix Factorization
Author :
Mirisaee, Seyed Hamid ; Gaussier, Eric ; Termier, Alexandre
Author_Institution :
Univ. Grenoble Alps, Grenoble, France
fYear :
2014
Firstpage :
39
Lastpage :
45
Abstract :
This paper addresses the problem of efficiently finding a small number of representative frequent itemsets in transaction matrices. To do so, we rely on matrix decomposition techniques, and more precisely on Constrained Binary Matrix Factorization (CBMF), which decomposes a given binary matrix into the product of two lower-dimensional binary matrices, called factors. We first show that, under binary constraints, the first factor can be interpreted as a transaction matrix operating on packets of items, whereas the second factor indicates which item belongs to which packet. We then formally prove that the CBMF factors can be mined directly to find (approximate) itemsets of a given size and support in the original transaction matrix. Finally, through a detailed experimental study, we show that the frequent itemsets produced by our method represent a significant portion of the set of all frequent itemsets according to existing metrics, while being up to several orders of magnitude less numerous.
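Illustration (not part of the original record): the interpretation of the two factors described in the abstract can be sketched on a toy example. The matrices W and H below are made-up data, the function names are hypothetical, and the code does not reproduce the authors' factorization algorithm. It only assumes two binary factors whose ordinary product is itself binary, and checks that the items of each packet form an itemset whose support in the reconstructed matrix is at least the number of transactions using that packet.

```python
def matmul(W, H):
    """Ordinary integer product of two matrices given as lists of rows."""
    return [[sum(w * h for w, h in zip(row, col)) for col in zip(*H)]
            for row in W]


def packet_itemsets(W, H):
    """For each packet (row of H), return its items and the number of
    transactions (rows of W) that use the packet."""
    result = []
    for j, packet in enumerate(H):
        items = frozenset(c for c, bit in enumerate(packet) if bit)
        count = sum(row[j] for row in W)
        result.append((items, count))
    return result


def support(X, items):
    """Number of rows of X that contain every item in `items`."""
    return sum(all(row[c] for c in items) for row in X)


# Toy factors (assumed data): 4 transactions, 2 packets, 4 items.
W = [[1, 0],
     [1, 1],
     [0, 1],
     [1, 0]]          # transaction x packet
H = [[1, 1, 0, 0],
     [0, 0, 1, 1]]    # packet x item

X = matmul(W, H)      # reconstructed transaction x item matrix

for items, count in packet_itemsets(W, H):
    print(sorted(items), "used by", count, "transactions;",
          "support in X:", support(X, items))
```

Because a transaction that uses a packet contains every item of that packet in the product, the column sums of the first factor give lower bounds on itemset support without scanning the full transaction matrix.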
Keywords :
approximation theory; data mining; matrix decomposition; CBMF factors; binary constraints; constrained binary matrix factorization; frequent itemsets; itemset approximation; lower-dimensional binary matrices; matrix decomposition techniques; pattern mining; transaction matrices; Algorithm design and analysis; Approximation algorithms; Approximation methods; Data mining; Itemsets; Matrix decomposition; Measurement;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Science and Advanced Analytics (DSAA), 2014 International Conference on
Type :
conf
DOI :
10.1109/DSAA.2014.7058049
Filename :
7058049