• DocumentCode
    2909968
  • Title
    HDG-Tree: A Structure for Clustering High-dimensional Data Streams
  • Author
    Ren, Jiadong ; Li, Lining ; Xia, Yan
  • Author_Institution
    Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
  • Volume
    2
  • fYear
    2009
  • fDate
    21-22 Nov. 2009
  • Firstpage
    594
  • Lastpage
    597
  • Abstract
    Clustering data streams is challenging because memory is limited and the data can be scanned only once. In this paper, a new grid-based algorithm for clustering high-dimensional data streams (called GHStream) is proposed, which adopts a two-phase clustering framework. In the online component, a High-dimensional Dense Grid Tree (abbreviated HDG-Tree) is presented to summarize the streaming data. As the data stream evolves, the HDG-Tree is dynamically updated. In the offline component, when a user issues a clustering request, the grid cells stored in the HDG-Tree are marked with different clusterIDs to generate the final clustering results. The experimental results on real and synthetic datasets demonstrate that GHStream has higher clustering quality and better scalability.
  • Keywords
    pattern clustering; tree data structures; GHStream algorithm; HDG-Tree algorithm; grid based algorithm; high dimensional data streams clustering; high dimensional dense grid tree; Clustering algorithms; Costs; Data analysis; Information technology; Intelligent structures; Lattices; Mesh generation; Monitoring; Partitioning algorithms; Space technology; clustering; data streams; high dimension;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Information Technology Application, 2009. IITA 2009. Third International Symposium on
  • Conference_Location
    Nanchang
  • Print_ISBN
    978-0-7695-3859-4
  • Type
    conf
  • DOI
    10.1109/IITA.2009.370
  • Filename
    5369007