DocumentCode :
2542070
Title :
Finding frequent items in data streams using hierarchical information
Author :
Wang, Xiaoyu ; Liu, Hongyan ; Han, Jiawei
Author_Institution :
Tsinghua Univ., Beijing
fYear :
2007
fDate :
7-10 Oct. 2007
Firstpage :
431
Lastpage :
436
Abstract :
Finding frequent items or top-k items in data streams is a basic mining task with a wide range of applications. Many algorithms have been proposed to improve performance on this task, but little effort has been made to exploit the hierarchical information carried by items in a data stream. In this paper, we try to improve the accuracy of finding frequent items by using the hierarchical information in a taxonomy. To do so, we propose a method called Merge. Based on this method, we design and implement an algorithm named FISHMerge. To evaluate the performance of the algorithm, we propose three new measures for testing and develop a hierarchical stream data generator. After conducting a comprehensive experimental study, we conclude that FISHMerge is more accurate than algorithms that do not use hierarchical information when given the same amount of memory. In addition, our algorithm can provide some information about higher-level items.
Keywords :
data mining; merging; tree data structures; FISHMerge algorithm; data streams; frequent item finding; hierarchical information; mining task; taxonomy tree; Algorithm design and analysis; Data structures; Error analysis; Error correction; Filters; Frequency conversion; Sampling methods; Taxonomy; Telephony; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
978-1-4244-0990-7
Electronic_ISBN :
978-1-4244-0991-4
Type :
conf
DOI :
10.1109/ICSMC.2007.4413754
Filename :
4413754