DocumentCode :
2186092
Title :
Clustering Data Streams Using Mass Estimation
Author :
Sabau, Andrei Sorin
Author_Institution :
Fac. of Math. & Comput. Sci., Univ. of Pitesti, Pitesti, Romania
fYear :
2013
fDate :
23-26 Sept. 2013
Firstpage :
289
Lastpage :
295
Abstract :
The explosive growth of data generation, storage and analysis within the last decade has led to extensive research into stream mining algorithms. The existing stream clustering literature contains both adaptations of classical methods and novel ones that try to address the space and time scalability issues arising from high-volume, high-velocity information assets. This paper presents MaStream, a novel stream clustering algorithm with constant space complexity and average-case sub-linear time complexity. The algorithm uses mass estimation as an alternative to density estimation and employs no distance measure, which makes it highly adaptable to both low- and high-dimensional data streams. Employing an evolving ensemble of h:d-Trees, the algorithm identifies arbitrarily shaped clusters while handling both noise and outliers without a priori information such as the total number of clusters. Experimental results over a series of synthetic and real datasets illustrate the algorithm's performance.
Keywords :
computational complexity; data analysis; data mining; pattern clustering; trees (mathematics); MaStream; Mass Estimation; constant space complexity; data analysis; data generation; data storage; density estimation; h:d-trees; novel data stream clustering algorithm; stream mining algorithms; sub-linear time complexity; Algorithm design and analysis; Clustering algorithms; Data mining; Data models; Estimation; Partitioning algorithms; Vegetation; clustering ensemble; mass-based clustering; stream clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2013 15th International Symposium on
Conference_Location :
Timisoara
Print_ISBN :
978-1-4799-3035-7
Type :
conf
DOI :
10.1109/SYNASC.2013.45
Filename :
6821162