Title :
Topology Improves Phylogenetic Motif Functional Site Predictions
Author :
KC, Dukka B. ; Livesay, Dennis R.
Author_Institution :
Dept. of Bioinf. & Genomics, Univ. of North Carolina at Charlotte, Charlotte, NC, USA
Abstract :
Prediction of protein functional sites from sequence-derived data remains an open bioinformatics problem. We have developed a phylogenetic motif (PM) functional site prediction approach that identifies functional sites from alignment fragments that parallel the evolutionary patterns of the family. In our approach, PMs are identified by comparing tree topologies of each alignment fragment to that of the complete phylogeny. Herein, we bypass the phylogenetic reconstruction step and identify PMs directly from distance matrix comparisons. In order to optimize the new algorithm, we consider three different distance matrices and 13 different matrix similarity scores. We assess the performance of the various approaches on a structurally nonredundant data set that includes three types of functional site definitions. Without exception, the predictive power of the original approach outperforms the distance matrix variants. While the distance matrix methods fail to improve upon the original approach, our results are important because they clearly demonstrate that the improved predictive power is based on the topological comparisons, meaning that phylogenetic trees are a straightforward yet powerful way to improve functional site prediction accuracy. While complementary studies have shown that topology improves predictions of protein-protein interactions, this report represents the first demonstration that trees improve functional site predictions as well.
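The distance matrix variant described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the three distance matrices or the 13 similarity scores, so the p-distance and the Pearson correlation used here are assumed stand-ins, and the function names and the `width` parameter are hypothetical. Each sliding-window alignment fragment gets its own pairwise distance matrix, which is then scored against the matrix of the full alignment.

```python
from itertools import combinations
from math import sqrt


def p_distance(a, b):
    """Fraction of mismatched columns (assumed distance; skips gap-gap columns)."""
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_vector(seqs):
    """Upper triangle of the pairwise distance matrix, flattened to a list."""
    return [p_distance(a, b) for a, b in combinations(seqs, 2)]


def pearson(x, y):
    """Pearson correlation (assumed similarity score); 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def window_scores(alignment, width):
    """Score every alignment fragment of the given width against the full alignment."""
    full = distance_vector(alignment)
    scores = []
    for start in range(len(alignment[0]) - width + 1):
        fragment = [seq[start:start + width] for seq in alignment]
        scores.append((start, pearson(distance_vector(fragment), full)))
    return scores
```

Under these assumptions, windows whose scores are highest would be the candidate PMs; how a threshold is chosen is outside what the abstract states.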
Keywords :
bioinformatics; molecular biophysics; proteins; topology; alignment fragments; bioinformatics problem; distance matrix comparisons; distance matrix variants; phylogenetic motif functional site predictions; protein functional sites; protein-protein interactions; sequence-derived data; tree topologies; Accuracy; Biochemistry; Genetic mutations; Ontologies; Partitioning algorithms; Phylogeny; Phylogenetic motif; distance matrix; functional site prediction; phylogenetic tree; Algorithms; Area Under Curve; Binding Sites; Computational Biology; Models, Genetic; Models, Statistical; Protein Interaction Domains and Motifs; Sequence Alignment; Sequence Analysis, Protein; Software
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2009.60