  • DocumentCode
    162496
  • Title
    Efficiently Finding Top-K Items from Evolving Distributed Data Streams
  • Author
    Baoyuan Qi; Gang Ma; Zhongzhi Shi; Wei Wang
  • Author_Institution
    Key Lab. of Intell. Inf. Process., Inst. of Comput. Technol., Beijing, China
  • fYear
    2014
  • fDate
    27-29 Aug. 2014
  • Firstpage
    137
  • Lastpage
    140
  • Abstract
    The problem of efficiently finding the top-k frequent items has attracted much attention in recent years. Storage constraints at the processing nodes and the intrinsically evolving nature of data streams are the two main challenges. In this paper, we propose a method that tackles these two challenges using the Space-Saving algorithm and a gossip-based algorithm, respectively. Our method is implemented on SAMOA (Scalable Advanced Massive Online Analysis), a machine learning framework. The experimental results show its effectiveness and scalability.
  • Keywords
    data mining; learning (artificial intelligence); SAMOA framework; evolving distributed data streams; gossip-based algorithm; scalable advanced massive online analysis machine learning framework; space-saving algorithm; top-k frequent items; Data mining; Distributed databases; Machine learning algorithms; Monitoring; Peer-to-peer computing; Protocols; Radiation detectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Semantics, Knowledge and Grids (SKG), 2014 10th International Conference on
  • Conference_Location
    Beijing
  • Type
    conf
  • DOI
    10.1109/SKG.2014.18
  • Filename
    6964679