DocumentCode :
1496159
Title :
True Path Rule Hierarchical Ensembles for Genome-Wide Gene Function Prediction
Author :
Valentini, Giorgio
Author_Institution :
DSI-Dipt. di Sci. dell'Informazione, Univ. degli Studi di Milano, Milan, Italy
Volume :
8
Issue :
3
fYear :
2011
Firstpage :
832
Lastpage :
847
Abstract :
Gene function prediction is a complex computational problem, characterized by several items: the number of functional classes is large, and a gene may belong to multiple classes; functional classes are structured according to a hierarchy; classes are usually unbalanced, with more negative than positive examples; class labels can be uncertain and the annotations largely incomplete; to improve the predictions, multiple sources of data need to be properly integrated. In this contribution, we focus on the first three items, and, in particular, on the development of a new method for hierarchical genome-wide and ontology-wide gene function prediction. The proposed algorithm is inspired by the “true path rule” (TPR) that governs both the Gene Ontology and FunCat taxonomies. According to this rule, the proposed TPR ensemble method is characterized by a two-way asymmetric flow of information that traverses the graph-structured ensemble: positive predictions for a node recursively influence its ancestors, while negative predictions influence its descendants. Cross-validated results with the model organism S. cerevisiae, using seven different sources of biomolecular data, and a theoretical analysis of the TPR algorithm show the effectiveness and the drawbacks of the proposed approach.
Keywords :
bioinformatics; data analysis; genetics; genomics; macromolecules; molecular biophysics; FunCat taxonomy; biomolecular data; data sources; genome-wide gene function; graph-structured ensemble; ontology-wide gene function prediction; true path rule method; two-way asymmetric flow; Biochemistry; Bioinformatics; Biological processes; Couplings; Genomics; Ontologies; Organisms; Prediction algorithms; Prediction methods; Taxonomy; Functional Catalogue (FunCat); Gene function prediction; ensemble methods; hierarchical classification; Algorithms; Artificial Intelligence; Databases, Genetic; Genes; Genomics; Logistic Models; Normal Distribution; Reproducibility of Results; Saccharomyces cerevisiae Proteins
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2010.38
Filename :
5467036