• DocumentCode
    3591082
  • Title

    A software tool to explore the structure of high dimensional biomolecular data

  • Author

    Pavesi, Giulio ; Zambelli, Federico ; Rè, Matteo ; Valentini, G.

  • Author_Institution
    Dept. of Biomol. Sci. & Biotechnol., Univ. of Milan, Milan, Italy
  • fYear
    2010
  • Firstpage
    1070
  • Lastpage
    1074
  • Abstract
    In gene expression data analysis, several methods based on the concept of stability have been proposed to estimate the reliability of each individual gene expression cluster as well as the “optimal” number of clusters. In this conceptual framework, a clustering ensemble is obtained through bootstrapping techniques, noise injection into the data, or random projections into lower-dimensional subspaces. A measure of the reliability of a given clustering is obtained through specific stability/reliability scores based on the similarity of the clusterings composing the ensemble. In this paper we present a software tool for detecting reliable and possibly multiple structures (e.g. hierarchical structures) simultaneously present in the data. Statistical approaches based on the chi-square distribution and on the Bernstein inequality show that stability-based methods can be successfully applied to the statistical assessment of the reliability of clusters, and to the discovery of multiple structures underlying complex biomolecular data.
  • Keywords
    Bioinformatics; Clustering algorithms; Data analysis; Gene expression; Maintenance; Packaging; Reproducibility of results; Software tools; Stability; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical Engineering/Electronics Computer Telecommunications and Information Technology (ECTI-CON), 2010 International Conference on
  • Print_ISBN
    978-1-4244-5606-2
  • Electronic_ISBN
    978-1-4244-5607-9
  • Type

    conf

  • Filename
    5491640