Title :
New Descriptors of Evolutionary Information for Accurate Prediction of DNA and RNA-Binding Residues in Protein Sequences
Author :
Wang, Liangjiang ; Huang, Caiyan
Author_Institution :
Dept. of Genetics & Biochem., Clemson Univ., Clemson, SC, USA
Abstract :
Evolutionary information in the form of a position-specific scoring matrix (PSSM) has often been used to construct classifiers for biological sequence analysis. However, the PSSM is designed for PSI-BLAST searches, and it may not contain all the evolutionary information needed for modeling specific sequence patterns. In this study, several new descriptors of evolutionary information were developed and evaluated for sequence-based prediction of DNA- and RNA-binding residues using support vector machines (SVMs). The new descriptors were shown to improve classifier performance. Interestingly, the best classifiers were obtained by combining the new descriptors with the PSSM, suggesting that the two capture different aspects of evolutionary information for DNA- and RNA-binding site prediction. The SVM classifiers achieved 77.3% sensitivity and 79.3% specificity for prediction of DNA-binding residues, and 71.6% sensitivity and 78.7% specificity for RNA-binding site prediction. Predictions at this level of accuracy may provide useful information for protein functional annotation, protein-nucleic acid docking, and experimental studies such as site-directed mutagenesis.
Keywords :
DNA; bioinformatics; proteomics; support vector machines; DNA binding residues; RNA binding residues; biological sequence analysis; evolutionary information descriptors; mutagenesis; position specific scoring matrix; protein sequences; Amino acids; Artificial neural networks; Biological information theory; Biological system modeling; Encoding; Proteins; Sequences; Support vector machine classification; DNA or RNA-binding site prediction; feature extraction; machine learning; selection
Conference_Title :
Bioinformatics, Systems Biology and Intelligent Computing, 2009. IJCBS '09. International Joint Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3739-9
DOI :
10.1109/IJCBS.2009.102