Title :
A Randomized Approach for Approximating the Number of Frequent Sets
Author :
Boley, Mario ; Grosskreutz, Henrik
Author_Institution :
Schloss Birlinghoven, Fraunhofer IAIS, Sankt Augustin
Abstract :
We investigate the problem of counting the number of frequent (item)sets - a problem known to be intractable in terms of an exact polynomial time computation. In this paper, we show that it is in general also hard to approximate. Subsequently, a randomized counting algorithm is developed using the Markov chain Monte Carlo method. While for general inputs an exponential running time is needed in order to guarantee a certain approximation bound, we empirically show that the algorithm still has the desired accuracy on real-world datasets when its running time is capped polynomially.
Keywords :
Markov processes; Monte Carlo methods; approximation theory; polynomials; set theory; Markov chain Monte Carlo method; approximation bound; frequent sets number; polynomial time computation; randomized approach; randomized counting algorithm; Approximation algorithms; Association rules; Data mining; Explosions; Frequency; Polynomials; approximation; counting; frequent itemset mining; randomized algorithms;
Conference_Titel :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3502-9
DOI :
10.1109/ICDM.2008.85