Title :
Go green: recycle and reuse frequent patterns
Author :
Cong, Gao ; Ooi, Beng Chin ; Tan, Kian-Lee ; Tung, Anthony K H
Author_Institution :
Dept. of Comput. Sci., Nat. Univ. of Singapore, Singapore
Date :
30 March-2 April 2004
Abstract :
In constrained data mining, users specify constraints that prune the search space and avoid mining uninteresting knowledge. Typically, a user sets initial constraint values and then refines them iteratively until the results are satisfactory. Existing mining schemes treat each iteration as a distinct mining process and fail to exploit the information generated in earlier iterations. We propose to salvage the knowledge discovered in an earlier iteration of mining to enhance subsequent rounds. In particular, we look at how frequent patterns can be recycled. Our proposed strategy operates in two phases. In the first phase, frequent patterns obtained from an earlier iteration are used to compress the database. In the second phase, subsequent mining processes operate on the compressed database. We propose two compression strategies and adapt three existing frequent pattern mining techniques to exploit the compressed database. Results from our extensive experimental study show that the proposed recycling algorithms outperform their nonrecycling counterparts by an order of magnitude.
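The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not one of the paper's two compression strategies, which the abstract does not describe. It assumes a simple scheme in which each transaction is stored under the longest previously mined pattern it contains, together with its leftover items. The function names `compress` and `support` are hypothetical.

```python
from collections import defaultdict


def compress(transactions, patterns):
    """Phase 1 (assumed scheme): group each transaction under the longest
    known frequent pattern it contains, keeping only the leftover items."""
    ordered = sorted((frozenset(p) for p in patterns), key=len, reverse=True)
    groups = defaultdict(list)
    for t in transactions:
        t = frozenset(t)
        # Fall back to the empty pattern when no known pattern is contained.
        cover = next((p for p in ordered if p <= t), frozenset())
        groups[cover].append(t - cover)
    return dict(groups)


def support(itemset, compressed):
    """Phase 2: count transactions containing itemset on the compressed form."""
    itemset = frozenset(itemset)
    count = 0
    for pattern, residuals in compressed.items():
        rest = itemset - pattern
        if not rest:
            # The shared pattern already covers the itemset, so the whole
            # group counts without inspecting individual transactions.
            count += len(residuals)
        else:
            count += sum(1 for r in residuals if rest <= r)
    return count


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "b"}, {"c", "d"}, {"a", "c"}]
    compressed = compress(db, [{"a", "b"}])
    print(support({"a", "b"}, compressed))
    print(support({"a", "c"}, compressed))
```

In this sketch the saving comes from the first branch of `support`: whenever a candidate itemset is covered by a group's shared pattern, the group is counted in one step. A later mining round with a changed constraint, such as a lower support threshold, could reuse `compressed` instead of rescanning the raw transactions.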
Keywords :
data mining; database management systems; compressed database; constrained data mining; frequent pattern mining technique; frequent patterns recycling; frequent patterns reuse; nonrecycling counterpart; recycling algorithm; search space; Data engineering; Recycling
Conference_Title :
Data Engineering, 2004. Proceedings. 20th International Conference on
Print_ISBN :
0-7695-2065-0
DOI :
10.1109/ICDE.2004.1319990