• DocumentCode
    2832103
  • Title
    An Improved CURD Clustering Algorithm Based on Quotient Space
  • Author
    Zhao Xiaomin ; Lu Bin
  • Author_Institution
    Sch. of Comput. Sci. & Technol., North China Electr. Power Univ., Baoding, China
  • fYear
    2009
  • fDate
    11-13 Dec. 2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    As data size increases, both algorithmic efficiency and clustering quality attract more attention. CURD (Clustering Using References and Density) is a fast clustering algorithm based on reference points and density; it can discover clusters of arbitrary shape and has linear time complexity. However, it still has some shortcomings: its efficiency on high-dimensional data is uncertain, its noise handling is not ideal, and the number of clusters it produces may not satisfy users' requirements. To address these deficiencies, this paper introduces a new method for processing high-dimensional data using information entropy and quotient space theory. Additionally, it handles noise data in two stages. Finally, the step that sorts the reference points is improved using quotient space theory so as to produce multi-level clustering results that meet different user needs. Experiments show that the improved algorithm not only improves clustering quality but also maintains high efficiency.
  • Keywords
    data mining; entropy; pattern clustering; CURD clustering algorithm; high-dimensional data; information entropy technology; quotient space theory; Clustering algorithms; Computer science; Data mining; Databases; Information entropy; Multi-stage noise shaping; Noise shaping; Shape; Sorting; Space technology;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Software Engineering, 2009. CiSE 2009. International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-4507-3
  • Electronic_ISBN
    978-1-4244-4507-3
  • Type
    conf
  • DOI
    10.1109/CISE.2009.5364188
  • Filename
    5364188