Title :
Frequent items mining on data stream using hash-table and heap
Author :
Shan, Zhang ; Ling, Chen ; Li, Tu
Author_Institution :
Dept. of Comput. Sci., Yangzhou Univ., Yangzhou, China
Abstract :
Most existing algorithms for mining frequent items on data streams do not emphasize the importance of recent data items. We present an algorithm that detects items whose frequency counts exceed a user-specified threshold. The algorithm uses a hash table L and a heap to record potentially frequent items. It detects ε-approximate frequent data items on a data stream using O(|L| + 1/ε) memory, with O(log(1/ε)) processing time per data item. Experimental results on several artificial and real datasets show that our algorithm has higher precision, requires less memory, and consumes less computation time than other similar methods.
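The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a hash table and a heap can be combined under a time fading model. It is a Space-Saving-style summary with exponential fading, not the authors' method. The class name `FadingFrequentItems`, the `fade` parameter, and the replacement rule are all assumptions made for illustration. It keeps at most ceil(1/ε) counters, and each update costs O(log(1/ε)) heap operations.

```python
import math


class FadingFrequentItems:
    """Illustrative Space-Saving-style summary with exponential time fading.

    Hypothetical sketch: not the algorithm from the paper.
    """

    def __init__(self, epsilon, fade=0.999):
        self.capacity = math.ceil(1.0 / epsilon)  # at most 1/epsilon counters
        # Rather than decaying every counter at each step, each new arrival
        # gets a larger weight; the relative order of counters is the same.
        # Caveat: weights grow without bound, so a long stream would need
        # periodic rescaling, which is omitted here.
        self.growth = 1.0 / fade
        self.weight = 1.0   # weight of the next arrival
        self.total = 0.0    # faded stream length, in the same scaled units
        self.heap = []      # min-heap of [scaled_count, item]
        self.pos = {}       # hash table: item -> index in self.heap

    def _swap(self, i, j):
        h = self.heap
        h[i], h[j] = h[j], h[i]
        self.pos[h[i][1]] = i
        self.pos[h[j][1]] = j

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            small = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and self.heap[c][0] < self.heap[small][0]:
                    small = c
            if small == i:
                return
            self._swap(i, small)
            i = small

    def _sift_up(self, i):
        while i > 0:
            p = (i - 1) // 2
            if self.heap[i][0] >= self.heap[p][0]:
                return
            self._swap(i, p)
            i = p

    def add(self, item):
        w = self.weight
        self.total += w
        self.weight *= self.growth
        if item in self.pos:
            # Monitored item: raise its count and restore heap order.
            i = self.pos[item]
            self.heap[i][0] += w
            self._sift_down(i)
        elif len(self.heap) < self.capacity:
            # Free slot: start a new counter.
            self.heap.append([w, item])
            self.pos[item] = len(self.heap) - 1
            self._sift_up(len(self.heap) - 1)
        else:
            # Full: replace the minimum counter and inherit its count,
            # as in Space-Saving (counts may therefore be overestimates).
            del self.pos[self.heap[0][1]]
            self.heap[0][0] += w
            self.heap[0][1] = item
            self.pos[item] = 0
            self._sift_down(0)

    def frequent(self, support):
        """Items whose faded count is at least support * faded stream length."""
        threshold = support * self.total
        return {item for count, item in self.heap if count >= threshold}
```

For example, with `fade=0.5`, feeding five occurrences of one item followed by five of another and then calling `frequent(0.5)` reports only the more recent item, because older arrivals carry less weight.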
Keywords :
computational complexity; data mining; data stream; frequent items mining; hash-table; ε-approximate frequent data items; Area measurement; Computer errors; Computer science; Extraterrestrial measurements; Fading; Frequency; Information science; Sampling methods; Space technology; frequent items; hash table; heap; time fading model
Conference_Title :
Intelligent Computing and Intelligent Systems, 2009. ICIS 2009. IEEE International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-4754-1
Electronic_ISBN :
978-1-4244-4738-1
DOI :
10.1109/ICICISYS.2009.5357918