Title :
A new online clustering approach for data in arbitrary shaped clusters
Author :
Hyde, Richard ; Angelov, Plamen
Author_Institution :
Sch. of Comput. & Commun., Lancaster Univ., Lancaster, UK
Abstract :
In this paper we demonstrate a new density-based clustering technique, CODAS, for online clustering of streaming data into arbitrary shaped clusters. CODAS is a two-stage process that uses a simple local density measure to initiate micro-clusters, which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical micro-clusters, which allow a fast micro-cluster joining technique that is dimensionally independent. The micro-clusters divide the data space into sub-spaces with a core region and a non-core region. Core regions which intersect define the clusters. A threshold value is used to distinguish outlier micro-clusters from small clusters of unusual data. The cluster information is fully maintained online. In this paper we compare CODAS with ELM, DEC, Chameleon, DBSCAN and DenStream and demonstrate that CODAS achieves comparable results, but in a fully online and dimensionally scalable manner.
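The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, the parameters `radius`, `core_radius` and `min_count`, and the running-mean centre update for samples landing in the non-core region are all assumptions made for illustration. The sketch keeps only a centre and a count per micro-cluster (no samples are stored), treats micro-clusters below the count threshold as outliers, and joins micro-clusters whose core hyper-spheres intersect, which reduces to one centre-to-centre distance test.

```python
import math


class MicroCluster:
    """Hyper-spherical micro-cluster: only a centre and a sample count are kept."""

    def __init__(self, centre):
        self.centre = list(centre)
        self.count = 1


class OnlineMicroClusterSketch:
    """Illustrative sketch only; parameter names and update rule are assumptions."""

    def __init__(self, radius, core_radius, min_count):
        self.radius = radius            # outer radius of every micro-cluster
        self.core_radius = core_radius  # radius of the inner core region
        self.min_count = min_count      # threshold below which a micro-cluster is an outlier
        self.micro = []

    def add(self, point):
        """Stage 1: absorb one streaming sample, then discard it."""
        nearest, nearest_d = None, math.inf
        for mc in self.micro:
            d = math.dist(mc.centre, point)
            if d < nearest_d:
                nearest, nearest_d = mc, d
        if nearest is None or nearest_d > self.radius:
            # Sample lies outside every micro-cluster: start a new one.
            self.micro.append(MicroCluster(point))
            return
        nearest.count += 1
        if nearest_d > self.core_radius:
            # Assumed rule: samples in the non-core region pull the centre
            # towards them via a running mean.
            n = nearest.count
            nearest.centre = [c + (p - c) / n for c, p in zip(nearest.centre, point)]

    def clusters(self):
        """Stage 2: join non-outlier micro-clusters whose core regions intersect."""
        dense = [i for i, mc in enumerate(self.micro) if mc.count >= self.min_count]
        parent = {i: i for i in dense}

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for a_pos, a in enumerate(dense):
            for b in dense[a_pos + 1:]:
                gap = math.dist(self.micro[a].centre, self.micro[b].centre)
                # Two spheres of radius core_radius intersect when their
                # centres are at most 2 * core_radius apart.
                if gap <= 2 * self.core_radius:
                    parent[find(a)] = find(b)

        groups = {}
        for i in dense:
            groups.setdefault(find(i), []).append(i)
        return sorted(groups.values())
```

Calling `add(point)` once per arriving sample and `clusters()` whenever a snapshot is needed returns lists of micro-cluster indices, one list per cluster. The joining test involves only micro-cluster centres, so its cost depends on the number of micro-clusters rather than on the number of samples seen.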
Keywords :
data handling; pattern clustering; CODAS; arbitrary shaped clusters; computational efficiency; core regions; hyper-spherical microclusters; memory efficiency; microclusters; new online clustering approach; outlier microclusters; streaming data; unusual data; Accuracy; Clustering algorithms; Complexity theory; Nickel; Noise; Shape; Spirals; arbitrary shape clusters; big data; clustering; data streams; online clustering;
Conference_Title :
Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on
Conference_Location :
Gdynia
Print_ISBN :
978-1-4799-8320-9
DOI :
10.1109/CYBConf.2015.7175937