Title :
An Improved CURD Clustering Algorithm Based on Quotient Space
Author :
Zhao Xiaomin ; Lu Bin
Author_Institution :
Sch. of Comput. Sci. & Technol., North China Electr. Power Univ., Baoding, China
Abstract :
As data sizes increase, the efficiency of algorithms and the quality of clustering draw more attention. CURD (clustering using references and density) is a fast clustering algorithm based on reference points and density that can discover clusters of arbitrary shape and has linear time complexity. However, it still has some shortcomings: its efficiency on high-dimensional data is uncertain, its noise processing is not ideal, and the number of clusters it produces may not satisfy users' requirements. To address these deficiencies, this paper introduces a new method for processing high-dimensional data using information entropy and quotient space theory. Additionally, it handles noise data in two stages. Finally, the step that sorts the reference points is improved using quotient space theory to produce multi-level clustering results that meet the different needs of users. Experiments show that the improved algorithm not only improves clustering quality but also maintains high efficiency.
Keywords :
data mining; entropy; pattern clustering; CURD clustering algorithm; high-dimensional data; information entropy technology; quotient space theory; Clustering algorithms; Computer science; Data mining; Databases; Information entropy; Multi-stage noise shaping; Noise shaping; Shape; Sorting; Space technology;
Conference_Title :
Computational Intelligence and Software Engineering, 2009. CiSE 2009. International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-4507-3
Electronic_ISBN :
978-1-4244-4507-3
DOI :
10.1109/CISE.2009.5364188