• DocumentCode
    445808
  • Title

    Random projections for assessing gene expression cluster stability

  • Author

    Bertoni, Alberto ; Valentini, Giorgio

  • Author_Institution
    Dipt. di Sci. dell'Informazione, Università degli Studi di Milano, Italy
  • Volume
    1
  • fYear
    2005
  • fDate
    31 July-4 Aug. 2005
  • Firstpage
    149
  • Abstract
    Clustering analysis of gene expression is characterized by the very high dimensionality and low cardinality of the data, and two important related topics are the validation and the estimate of the number of the obtained clusters. In this paper we focus on the estimate of the stability of the clusters. Our approach to this problem is based on random projections obeying the Johnson-Lindenstrauss lemma, by which gene expression data may be projected into randomly selected low dimensional subspaces, approximately preserving pairwise distances between examples. We experiment with different types of random projections, comparing empirical and theoretical distortions induced by randomized embeddings between Euclidean metric spaces, and we present cluster-stability measures that may be used to validate and to quantitatively assess the reliability of the clusters obtained by a large class of clustering algorithms. Experimental results with high dimensional synthetic and DNA microarray data show the effectiveness of the proposed approach.
  • Keywords
    DNA; biology; pattern clustering; DNA microarray data; Euclidean metric spaces; Johnson-Lindenstrauss lemma; cluster-stability measures; clustering analysis; gene expression cluster stability; high dimensional synthetic; random projections; randomized embeddings; Clustering algorithms; Clustering methods; Distortion measurement; Euclidean distance; Extraterrestrial measurements; Gene expression; Neoplasms; Reliability theory; Stability analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2005. IJCNN '05. Proceedings. 2005 IEEE International Joint Conference on
  • Print_ISBN
    0-7803-9048-2
  • Type

    conf

  • DOI
    10.1109/IJCNN.2005.1555821
  • Filename
    1555821
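
The abstract's central idea, that a random projection into a low dimensional subspace approximately preserves pairwise distances, can be illustrated with a minimal sketch. It assumes a Gaussian random matrix (only one of several possible projection types, and not necessarily one used in the paper), synthetic data in place of microarray data, and hypothetical helper names (`random_projection`, `pairwise_sq_dists`, `distortion_range`) that do not come from the paper.

```python
import numpy as np

def random_projection(X, d_out, rng):
    """Project the rows of X (n x d_in) into d_out dimensions.

    Assumption: a Gaussian matrix with entries ~ N(0, 1/d_out), so that
    squared Euclidean norms are preserved in expectation.
    """
    d_in = X.shape[1]
    R = rng.normal(0.0, 1.0 / np.sqrt(d_out), size=(d_in, d_out))
    return X @ R

def pairwise_sq_dists(X):
    """Squared Euclidean distances for all distinct pairs of rows."""
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    iu = np.triu_indices(X.shape[0], k=1)
    # Clip tiny negative values caused by floating-point rounding.
    return np.maximum(D[iu], 0.0)

def distortion_range(X, d_out, seed=0):
    """Smallest and largest ratio of projected to original squared distance."""
    rng = np.random.default_rng(seed)
    Y = random_projection(X, d_out, rng)
    ratio = pairwise_sq_dists(Y) / pairwise_sq_dists(X)
    return ratio.min(), ratio.max()

# Synthetic stand-in for expression data: few examples, many dimensions.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(30, 5000))

lo, hi = distortion_range(X, d_out=1000)
print(f"squared-distance ratios lie in [{lo:.3f}, {hi:.3f}]")
```

Ratios close to 1 indicate low distortion. Lowering `d_out` widens the interval, which is the trade-off the Johnson-Lindenstrauss lemma quantifies.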