DocumentCode
3457779
Title
A Weighted Subspace Clustering Algorithm in High-Dimensional Data Streams
Author
Ren, Jiadong ; Li, Lining ; Hu, Changzhen
Author_Institution
Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
fYear
2009
fDate
7-9 Dec. 2009
Firstpage
631
Lastpage
634
Abstract
Clustering is a significant and difficult problem in data stream mining because large volumes of streaming data arrive continuously. High-dimensional data streams make clustering analysis more complex because the data are sparse. In this paper, we propose a new clustering method for high-dimensional data streams, called WSCStream. The method incorporates a fading cluster structure and a dimensional weight matrix. In the matrix, we assign a weight to each dimension of the corresponding cluster; the weight indicates the importance of that dimension to the cluster. As new data points arrive over time, the weighted distance between a cluster and a data point is used to obtain the final clusters. Experimental results on real and synthetic datasets demonstrate that WSCStream achieves higher clustering quality than PHStream.
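The abstract does not give formulas, so the following is only a minimal sketch of the idea it describes: each cluster has its own row of dimension weights, and an arriving point is compared to clusters by a weighted distance. The weighted Euclidean form, the exponential fading factor 2^(-decay * elapsed), and all function and parameter names here are assumptions for illustration, not the paper's definitions.

```python
import math

def fading(decay, elapsed):
    # Assumed fading function: older contributions lose weight exponentially.
    return 2.0 ** (-decay * elapsed)

def weighted_distance(point, centroid, weights):
    # Assumed weighted Euclidean distance; weights[j] is the importance
    # of dimension j to this particular cluster.
    return math.sqrt(sum(w * (p - c) ** 2
                         for p, c, w in zip(point, centroid, weights)))

def nearest_cluster(point, centroids, weight_matrix):
    # weight_matrix[i] holds the dimension weights of cluster i.
    distances = [weighted_distance(point, c, w)
                 for c, w in zip(centroids, weight_matrix)]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]

if __name__ == "__main__":
    centroids = [(0.0, 0.0), (1.0, 0.0)]
    # Cluster 1 ignores the second dimension entirely (weight 0).
    weight_matrix = [(0.5, 0.5), (1.0, 0.0)]
    print(nearest_cluster((1.0, 10.0), centroids, weight_matrix))
    print(fading(decay=1.0, elapsed=1.0))
```

In this toy setup the point (1.0, 10.0) is assigned to cluster 1 because that cluster gives zero weight to the dimension in which the point is far away, which is the sense in which a per-cluster weight row defines a subspace.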
Keywords
data mining; pattern clustering; WSCStream; data stream mining; dimensional weight matrix; fading cluster structure; high-dimensional data streams; weighted subspace clustering algorithm; Clustering algorithms; Computerized monitoring; Data engineering; Data mining; Educational institutions; Information science; Lattices; Shape; Space technology; Weight control
fLanguage
English
Publisher
IEEE
Conference_Titel
Innovative Computing, Information and Control (ICICIC), 2009 Fourth International Conference on
Conference_Location
Kaohsiung
Print_ISBN
978-1-4244-5543-0
Type
conf
DOI
10.1109/ICICIC.2009.64
Filename
5412411