DocumentCode :
2403862
Title :
FAST: a new sampling-based algorithm for discovering association rules
Author :
Chen, Bin ; Haas, Peter J. ; Scheuermann, Peter
fYear :
2002
fDate :
2002
Firstpage :
263
Abstract :
We present FAST (finding associations from sampled transactions), a refined sampling-based mining algorithm that is distinguished from prior algorithms by its novel two-phase approach to sample collection. In phase I, a large sample is collected to quickly and accurately estimate the support of each item in the database. In phase II, a small final sample is obtained by excluding "outlier" transactions in such a manner that the support of each item in the final sample is as close as possible to the estimated support of the item in the entire database. We propose two approaches to obtaining the final sample in phase II: trimming and growing. The trimming procedure starts from the large initial sample and removes outlier transactions until a specified stopping criterion is satisfied. In contrast, the growing procedure selects representative transactions from the initial sample and adds them to an initially empty data set.
Keywords :
data mining; database management systems; transaction processing; FAST; association rule discovery; finding associations from sampled transactions; growing; sample collection; sampling-based mining algorithm; trimming; Association rules; Frequency conversion; Frequency measurement; Itemsets; Measurement standards; Phase estimation; Sampling methods; Size measurement; Transaction databases; Writing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering, 2002. Proceedings. 18th International Conference on
Conference_Location :
San Jose, CA
ISSN :
1063-6382
Print_ISBN :
0-7695-1531-2
Type :
conf
DOI :
10.1109/ICDE.2002.994717
Filename :
994717
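Illustrative sketch :
The trimming variant described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the distance measure or the stopping criterion, so the L1 distance over single-item supports and the fixed target size used here are assumptions, as are all function and parameter names (item_supports, distance, trim, fast_trim).

```python
import random
from collections import Counter


def item_supports(transactions):
    """Return the fraction of transactions that contain each item."""
    counts = Counter(item for t in transactions for item in set(t))
    n = len(transactions)
    return {item: c / n for item, c in counts.items()}


def distance(counts, n, target):
    """L1 distance between sample supports (counts / n) and target supports.

    Assumption: the abstract does not name a distance measure.
    """
    return sum(abs(counts.get(item, 0) / n - s) for item, s in target.items())


def trim(initial_sample, final_size):
    """Phase II (trimming): greedily drop 'outlier' transactions.

    At each step, remove the transaction whose removal leaves the item
    supports closest to those estimated from the initial sample.
    Assumption: the stopping criterion is simply reaching final_size.
    """
    if not 1 <= final_size <= len(initial_sample):
        raise ValueError("final_size must be between 1 and the sample size")
    target = item_supports(initial_sample)
    sample = [frozenset(t) for t in initial_sample]
    counts = Counter(item for t in sample for item in t)
    while len(sample) > final_size:
        n = len(sample) - 1
        best_index, best_dist = None, None
        for i, t in enumerate(sample):
            counts.subtract(t)           # tentatively remove transaction i
            d = distance(counts, n, target)
            counts.update(t)             # undo the tentative removal
            if best_dist is None or d < best_dist:
                best_index, best_dist = i, d
        counts.subtract(sample.pop(best_index))
    return sample


def fast_trim(database, initial_size, final_size, seed=None):
    """Phase I: draw a large simple random sample; phase II: trim it."""
    rng = random.Random(seed)
    initial_sample = rng.sample(database, initial_size)
    return trim(initial_sample, final_size)
```

A frequent-itemset miner would then be run on the small sample returned by fast_trim instead of the full database. The greedy loop is quadratic in the initial sample size and is meant only to make the idea concrete.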