DocumentCode :
2909968
Title :
HDG-Tree: A Structure for Clustering High-dimensional Data Streams
Author :
Ren, Jiadong ; Li, Lining ; Xia, Yan
Author_Institution :
Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
Volume :
2
fYear :
2009
fDate :
21-22 Nov. 2009
Firstpage :
594
Lastpage :
597
Abstract :
Clustering data streams is challenging because memory is limited and the data can be read in only a single pass. This paper proposes a new grid-based algorithm for clustering high-dimensional data streams, called GHStream, which adopts a two-phase clustering framework. In the online component, a High-dimensional Dense Grid Tree (HDG-Tree) summarizes the streaming data. As the data stream evolves, the HDG-Tree is updated dynamically. In the offline component, when a user issues a clustering request, the grid cells stored in the HDG-Tree are labeled with cluster IDs to generate the final clustering result. Experimental results on real and synthetic datasets demonstrate that GHStream has higher clustering quality and better scalability.
Keywords :
pattern clustering; tree data structures; GHStream algorithm; HDG-Tree algorithm; grid based algorithm; high dimensional data streams clustering; high dimensional dense grid tree; Clustering algorithms; Costs; Data analysis; Information technology; Intelligent structures; Lattices; Mesh generation; Monitoring; Partitioning algorithms; Space technology; clustering; data streams; high dimension
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Information Technology Application, 2009. IITA 2009. Third International Symposium on
Conference_Location :
Nanchang
Print_ISBN :
978-0-7695-3859-4
Type :
conf
DOI :
10.1109/IITA.2009.370
Filename :
5369007