• DocumentCode
    2851244
  • Title
    Revealing true subspace clusters in high dimensions
  • Author
    Liu, Jinze ; Strohmaier, Karl ; Wang, Wei
  • Author_Institution
    Dept. of Computer Science, University of North Carolina, Chapel Hill, NC, USA
  • fYear
    2004
  • fDate
    1-4 Nov. 2004
  • Firstpage
    463
  • Lastpage
    466
  • Abstract
    Subspace clustering is one of the best approaches for discovering meaningful clusters in high dimensional space. One cluster in high dimensional space may be transcribed into multiple distinct maximal clusters by projecting onto different subspaces. A direct consequence of clustering independently in each subspace is an overwhelmingly large set of overlapping clusters which may be significantly similar. To reveal the true underlying clusters, we propose a similarity measurement of the overlapping clusters. We adopt the model of Gaussian tailed hyper-rectangles to capture the distribution of any subspace cluster. A set of experiments on a synthetic dataset demonstrates the effectiveness of our approach. Application to real gene expression data also reveals impressive meta-clusters expected by biologists.
  • Keywords
    Gaussian processes; pattern clustering; statistical analysis; Gaussian tailed hyperrectangles; cluster intersection; gene expression; high dimensional space; overlapping cluster; similarity measurement; subspace clustering; Adhesives; Algorithm design and analysis; Biological system modeling; Clustering algorithms; Computer science; Data mining; Entropy; Merging; Tail; Adhesion; Gaussian Tails; Local Grid
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
  • Print_ISBN
    0-7695-2142-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2004.10034
  • Filename
    1410336