DocumentCode
2909968
Title
HDG-Tree: A Structure for Clustering High-dimensional Data Streams
Author
Ren, Jiadong ; Li, Lining ; Xia, Yan
Author_Institution
Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
Volume
2
fYear
2009
fDate
21-22 Nov. 2009
Firstpage
594
Lastpage
597
Abstract
Clustering data streams is challenging because memory is limited and the data can be scanned only once. In this paper, a new grid-based algorithm for clustering high-dimensional data streams, called GHStream, is proposed; it adopts a two-phase clustering framework. In the online component, a High-dimensional Dense Grid Tree (HDG-Tree) is presented to summarize the streaming data. As the data stream evolves, the HDG-Tree is updated dynamically. In the offline component, when a user issues a clustering request, the grid cells stored in the HDG-Tree are labeled with cluster IDs to generate the final clustering results. Experimental results on real and synthetic datasets demonstrate that GHStream has higher clustering quality and better scalability.
Keywords
pattern clustering; tree data structures; GHStream algorithm; HDG-Tree algorithm; grid based algorithm; high dimensional data streams clustering; high dimensional dense grid tree; Clustering algorithms; Costs; Data analysis; Information technology; Intelligent structures; Lattices; Mesh generation; Monitoring; Partitioning algorithms; Space technology; clustering; data streams; high dimension;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Information Technology Application, 2009. IITA 2009. Third International Symposium on
Conference_Location
Nanchang
Print_ISBN
978-0-7695-3859-4
Type
conf
DOI
10.1109/IITA.2009.370
Filename
5369007