DocumentCode :
2864789
Title :
Finding maximal frequent itemsets over online data streams adaptively
Author :
Lee, Daesu ; Lee, Wonsuk
Author_Institution :
Dept. of Comput. Sci., Yonsei Univ., South Korea
fYear :
2005
fDate :
27-30 Nov. 2005
Abstract :
Due to the characteristics of a data stream, it is very important to confine the memory usage of a data mining process regardless of the amount of information generated in the data stream. For this purpose, this paper proposes a CP-tree (compressed-prefix tree) that can be effectively used in finding either frequent or maximal frequent itemsets over an online data stream. Unlike a prefix tree, a node of a CP-tree can maintain the information of several itemsets together. Based on this characteristic, the size of a CP-tree can be flexibly controlled by merging or splitting nodes. In this paper, a mining method employing a CP-tree is proposed and an adaptive memory utilization scheme is also presented in order to maximize the mining accuracy of the proposed method for confined memory space at all times. Finally, the performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.
Keywords :
data mining; trees (mathematics); CP-tree; adaptive memory utilization; compressed-prefix tree; maximal frequent itemsets; mining method; online data streams; Buffer storage; Character generation; Computer science; Data analysis; Itemsets; Merging; Performance analysis; Size control; Space technology
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, Fifth IEEE International Conference on
ISSN :
1550-4786
Print_ISBN :
0-7695-2278-5
Type :
conf
DOI :
10.1109/ICDM.2005.68
Filename :
1565688