Title :
Memory effect in DBSCAN algorithm
Author :
Li Jian ; Yu Wei ; Yan Bao-Ping
Author_Institution :
Comput. Network Inf. Center, Chinese Acad. of Sci., Beijing, China
Abstract :
As a density based clustering algorithm, DBSCAN plays an important role in data mining. Normally DBSCAN algorithm is computationally expensive, limiting its performance in large-scale data sets, especially in high dimensional data sets. The high complexity is rooted from the region queries, a very common operation in density based algorithms, which brings the complexity of the algorithms to O(n2), where n is the number of database objects. With the help of index structure the complexity can be reduced to O (nlogn), however it is inefficient to create the index structure especially for high dimensional data sets or large-scale databases. In this paper we propose a new concept named memory effect (ME). ME can be used to shrink the scope of region queries to neighboring objects. Based on ME we have improved DBSCAN algorithm evidently, and empirical experiments have shown the improvement in both effectiveness and efficiency. At last, we give the theoretical analysis of MEDBSCAN algorithm and talk about the influence of parameters.
Keywords :
computational complexity; data mining; pattern clustering; storage management; DBSCAN algorithm; MEDBSCAN algorithm; O-nlogn; data mining; density based clustering algorithm; high dimensional data set; large-scale database; memory effect; region query; Algorithm design and analysis; Clustering algorithms; Computer science; Data mining; Databases; Indexes; Large-scale systems; Partitioning algorithms; Shape; Software algorithms; DBSCAN Algorithm; Density-based clustering; MEDBSCAN Algorithm; Memory Effect;
Conference_Titel :
Computer Science & Education, 2009. ICCSE '09. 4th International Conference on
Conference_Location :
Nanning
Print_ISBN :
978-1-4244-3520-3
Electronic_ISBN :
978-1-4244-3521-0
DOI :
10.1109/ICCSE.2009.5228532