DocumentCode
3591082
Title
A software tool to explore the structure of high dimensional biomolecular data
Author
Pavesi, Giulio ; Zambelli, Federico ; Rè, Matteo ; Valentini, G.
Author_Institution
Dept. of Biomol. Sci. & Biotechnol., Univ. of Milan, Milan, Italy
fYear
2010
Firstpage
1070
Lastpage
1074
Abstract
In gene expression data analysis, several methods based on the concept of stability have been proposed to estimate the reliability of each individual gene expression cluster as well as the “optimal” number of clusters. In this conceptual framework a clustering ensemble is obtained through bootstrapping techniques, noise injection into the data, or random projections into lower dimensional subspaces. A measure of the reliability of a given clustering is obtained through specific stability/reliability scores based on the similarity of the clusterings composing the ensemble. In this paper we present a software tool for detecting reliable and possibly multiple structures (e.g. hierarchical structures) simultaneously present in the data. Statistical approaches based on the chi-square distribution and on the Bernstein inequality show that stability-based methods can be successfully applied to the statistical assessment of the reliability of clusters and to the discovery of multiple structures underlying complex bio-molecular data.
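The stability idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' tool: it assumes noise injection as the perturbation, a plain k-means as the base clustering, and the Rand index as the pairwise similarity, and it omits the chi-square and Bernstein-inequality tests altogether. The names `kmeans`, `rand_index`, and `stability` are hypothetical.

```python
import random
from itertools import combinations

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means (assumed base clusterer); returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members (empty clusters keep their center).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def stability(points, k, n_runs=10, noise=0.05, seed=0):
    """Mean pairwise Rand index over clusterings of noise-perturbed copies of the data."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_runs):
        # Noise injection: one of the perturbation schemes named in the abstract.
        noisy = [tuple(x + rng.gauss(0, noise) for x in p) for p in points]
        ensemble.append(kmeans(noisy, k, rng))
    scores = [rand_index(a, b) for a, b in combinations(ensemble, 2)]
    return sum(scores) / len(scores)

# Toy data (invented for illustration): two well-separated 1-D groups.
toy = [(i * 0.1,) for i in range(10)] + [(10 + i * 0.1,) for i in range(10)]
score_k2 = stability(toy, k=2)
```

A score near 1 means the perturbed clusterings largely agree for that number of clusters. Comparing scores across several values of `k` is one simple way to look for multiple plausible structures, although the paper relies on formal statistical tests rather than a raw comparison like this one.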
Keywords
Bioinformatics; Clustering algorithms; Data analysis; Gene expression; Maintenance; Packaging; Reproducibility of results; Software tools; Stability; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Electrical Engineering/Electronics Computer Telecommunications and Information Technology (ECTI-CON), 2010 International Conference on
Print_ISBN
978-1-4244-5606-2
Electronic_ISBN
978-1-4244-5607-9
Type
conf
Filename
5491640