DocumentCode
162496
Title
Efficiently Finding Top-K Items from Evolving Distributed Data Streams
Author
Baoyuan Qi ; Gang Ma ; Zhongzhi Shi ; Wei Wang
Author_Institution
Key Lab. of Intell. Inf. Process., Inst. of Comput. Technol., Beijing, China
fYear
2014
fDate
27-29 Aug. 2014
Firstpage
137
Lastpage
140
Abstract
Efficiently finding the top-k frequent items in data streams has attracted much attention in recent years. Two main challenges are the storage constraints at each processing node and the intrinsically evolving nature of the data streams. In this paper, we propose a method that tackles these two challenges using the space-saving algorithm and a gossip-based algorithm, respectively. Our method is implemented on SAMOA (Scalable Advanced Massive Online Analysis), a machine learning framework. The experimental results show its effectiveness and scalability.
Keywords
data mining; learning (artificial intelligence); SAMOA framework; evolving distributed data streams; gossip-based algorithm; scalable advanced massive online analysis machine learning framework; space-saving algorithm; top-k frequent items; Data mining; Distributed databases; Machine learning algorithms; Monitoring; Peer-to-peer computing; Protocols; Radiation detectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Semantics, Knowledge and Grids (SKG), 2014 10th International Conference on
Conference_Location
Beijing
Type
conf
DOI
10.1109/SKG.2014.18
Filename
6964679