• DocumentCode
    2774964
  • Title

    A projective clustering algorithm based on significant local dense areas

  • Author

    Zong, Yu ; Xu, Guandong ; Jin, Ping ; Yi, Xun ; Chen, Enhong ; Wu, Zongda

  • Author_Institution
    Dept. of Inf. & Eng., West Anhui Univ., China
  • fYear
    2012
  • fDate
    10-15 June 2012
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    High-dimensional clustering is often encountered in real applications, and projective clustering is an effective way to deal with high-dimensional clustering problems, aiming to capture the dense areas embedded in subsets of attributes (subspaces). Most projective clustering algorithms use equal- or varying-width hyper-rectangle structures to identify the dense areas and their locations. Deciding the widths of these hyper-rectangle structures is therefore a crucial task in projective clustering. Using the real data distribution directly to determine the widths of the dense structures is a promising and feasible approach. In this paper, we propose a projective clustering algorithm based on a hyper-rectangle structure whose width is estimated from the kernel distribution of the real data. In particular, we first define a structure called the Significant Local Dense Area (SLDA) by using an efficient kernel density estimator, Rodeo; we then design a greedy search method to find all the SLDAs that cover the data distribution in the high-dimensional space; finally, we run a single-linkage clustering algorithm on the SLDAs to form the final clusters and identify the outliers. The main strength of the proposed algorithm is validated by experiments on synthetic and real-world data sets.
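    The final step named in the abstract, single-linkage clustering over SLDAs, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each SLDA is an axis-aligned box given as one (low, high) interval per dimension in the full space, and that two boxes are linked when they overlap or lie within a hypothetical `gap` tolerance in every dimension. Cutting single linkage at a fixed threshold is equivalent to taking connected components of that link graph, which the union-find below computes. The names `Box`, `boxes_touch`, `single_linkage_boxes`, and `gap` are illustrative only.

```python
from typing import Dict, List, Tuple

# Assumption: a dense area is an axis-aligned box, one (low, high) pair per dimension.
Box = Tuple[Tuple[float, float], ...]


def boxes_touch(a: Box, b: Box, gap: float = 0.0) -> bool:
    """True if the boxes overlap, or lie within `gap` of each other, in every dimension."""
    return all(
        lo_a <= hi_b + gap and lo_b <= hi_a + gap
        for (lo_a, hi_a), (lo_b, hi_b) in zip(a, b)
    )


def single_linkage_boxes(boxes: List[Box], gap: float = 0.0) -> List[int]:
    """Label each box with the id of its connected component (threshold single linkage)."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent pointers to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes that touch; O(n^2) pairs, fine for a sketch.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if boxes_touch(boxes[i], boxes[j], gap):
                parent[find(j)] = find(i)

    # Renumber roots as consecutive cluster ids in order of first appearance.
    ids: Dict[int, int] = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(boxes))]


example_boxes: List[Box] = [
    ((0.0, 1.0), (0.0, 1.0)),
    ((0.5, 2.0), (0.5, 2.0)),
    ((5.0, 6.0), (5.0, 6.0)),
]
labels = single_linkage_boxes(example_boxes)
```

    In this example the first two boxes overlap and share a label, while the third is isolated. How outliers are identified is not specified in the abstract; one possible convention, stated here only as an assumption, would be to treat points covered by no box as outliers.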
  • Keywords
    pattern clustering; search problems; Rodeo; SLDA; data distribution; greedy search method; high dimensional clustering problems; hyper-rectangle structure; kernel density estimator; projective clustering algorithm; significant local dense area structure; single-linkage clustering algorithm; Algorithm design and analysis; Clustering algorithms; Distributed databases; Educational institutions; Kernel; Search problems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks (IJCNN), The 2012 International Joint Conference on
  • Conference_Location
    Brisbane, QLD
  • ISSN
    2161-4393
  • Print_ISBN
    978-1-4673-1488-6
  • Electronic_ISBN
    2161-4393
  • Type

    conf

  • DOI
    10.1109/IJCNN.2012.6252668
  • Filename
    6252668