Title :
Toward a systematic definition of protein function that scales to the genome level: defining function in terms of interactions
Author :
Lan, Ning ; Jansen, Ronald ; Gerstein, Mark
Author_Institution :
Dept. of Molecular Biophys. & Biochem., Yale Univ., New Haven, CT, USA
fDate :
12/1/2002 12:00:00 AM
Abstract :
The ultimate goal of functional genomics is to elucidate the function of all the genes in the genome. However, the current notions of function are crafted for individual proteins. The degree to which they can scale to the genomic level is not clear. In this paper, we review the diverse approaches to functional classification, focusing on their ability to meet this challenge of scale. Our review emphasizes a number of key parameters of the systems: their accuracy, comprehensiveness, level of standardization, flexibility, and support for data mining. We then propose an approach that synthesizes a number of the promising features of the existing systems. Our approach, which we call a function grid, is based on the notion of defining a protein's function through molecular interactions: specifically, in terms of its probability of interaction with various ligands, the list of which can be expanded infinitely. To illustrate how our function grid can be used in genome-wide prediction of function, we construct a grid of yeast genes; combine it with other genomic information, including sequence features, structure, subcellular localization, and messenger ribonucleic acid expression; and then use decision trees and support vector machines to predict deoxyribonucleic acid binding.
Keywords :
decision trees; genetics; learning automata; proteins; reviews; deoxyribonucleic acid binding prediction; gene interactions; genome-wide function prediction; ligands interaction probabilities; messenger ribonucleic acid expression; molecular interactions; ontology; protein function; proteome; scaling to genome level; subcellular localization; support vector machines; system key parameters; systematic definition; yeast genes grid; Bioinformatics; Cells (biology); DNA; Fungi; Genomics; Molecular biophysics; Organisms; Proteins; RNA; Sequences;
Journal_Title :
Proceedings of the IEEE
DOI :
10.1109/JPROC.2002.805302