Title :
A novel decomposition algorithm for binary datatables: Encouraging results on discrimination tasks
Author :
Cadot, Martine ; Lelu, Alain
Author_Institution :
Dept. Inf., Univ. de Nancy, Nancy, France
Abstract :
We present an algorithm for decomposing any binary datatable into a set of “sufficient itemsets”, i.e. a non-redundant list of itemsets sufficient to reconstruct the whole table up to a permutation of its rows. To do so, we replace the “support” threshold criterion of the well-known Apriori algorithm with a “number of liberties”: the liberty count expresses how strongly a (k+1)-level itemset is constrained by its k-level “parents”, up to the level at which the situation becomes frozen. Our algorithm is symmetric: our itemsets take into account the absence of items as well as their presence. Conversely, we present a method for reconstructing the original data from our exact MIDOVA representation. We illustrate these points with the Breast Cancer and Mushroom datasets from the UCI Repository. We validate our approach by deriving a learning classifier from it and applying it to three discrimination problems drawn from the same repository.
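Illustrative sketch (not the authors' implementation): the abstract does not define the liberty count precisely, so the Python snippet below assumes one plausible reading for the simplest case, a 2-itemset whose "parents" are its two single items. Under that assumption, the number of liberties is the width of the interval of supports the pair could still take once the parents' supports are fixed (the classical Fréchet bounds). The function names `support`, `pair_support_bounds` and `pair_liberties`, and the toy table, are hypothetical.

```python
def support(rows, itemset):
    """Count the rows matching every (column, value) literal in `itemset`.

    A literal with value 0 expresses the absence of an item, so presence
    and absence are handled symmetrically.
    """
    return sum(all(row[col] == val for col, val in itemset) for row in rows)


def pair_support_bounds(n_rows, support_a, support_b):
    """Frechet bounds on the support of {A, B} given the supports of A and B."""
    lower = max(0, support_a + support_b - n_rows)
    upper = min(support_a, support_b)
    return lower, upper


def pair_liberties(rows, col_a, col_b):
    """Assumed liberty count: width of the feasible support interval.

    A value of 0 means the pair's support is fully determined by its
    parents (a "frozen" situation under this reading).
    """
    s_a = support(rows, [(col_a, 1)])
    s_b = support(rows, [(col_b, 1)])
    lower, upper = pair_support_bounds(len(rows), s_a, s_b)
    return upper - lower


# Toy binary datatable: 4 rows, 3 items (made up for illustration).
table = [
    (1, 1, 0),
    (1, 0, 0),
    (0, 1, 0),
    (1, 1, 0),
]

print(support(table, [(0, 1), (1, 1)]))  # item 0 present and item 1 present
print(support(table, [(0, 1), (2, 0)]))  # item 0 present and item 2 absent
print(pair_liberties(table, 0, 1))       # pair still has some freedom
print(pair_liberties(table, 0, 2))       # item 2 never occurs: pair is frozen
```

In this toy table, item 2 never occurs, so any pair containing it has a fully determined support and would not need to be expanded further under the assumed reading. Higher-level itemsets would require bounds computed from all k-level parents, which this sketch does not attempt.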
Keywords :
Breast cancer; Data mining; Displays; Feature extraction; Itemsets; Kernel; Matrix decomposition; Social network services; Text mining; Web mining; association mining; classification; knowledge discovery; learning classifier system; matrix decomposition; negative itemset
Conference_Title :
Research Challenges in Information Science (RCIS), 2010 Fourth International Conference on
Conference_Location :
Nice, France
Print_ISBN :
978-1-4244-4839-5
Electronic_ISBN :
2151-1349
DOI :
10.1109/RCIS.2010.5507364