DocumentCode :
830125
Title :
Fold recognition by predicted alignment accuracy
Author :
Xu, Jinbo
Author_Institution :
Dept. of Mathematics & Comput. Sci., MIT, Cambridge, MA, USA
Volume :
2
Issue :
2
fYear :
2005
Firstpage :
157
Lastpage :
165
Abstract :
A key step in protein structure prediction by protein threading is choosing the best overall template for a given target sequence once all the optimal sequence-template alignments have been generated. The chosen template should be the one with the best alignment to the target sequence, because the three-dimensional structure of the target is built from that sequence-template alignment. The traditional method for template selection is the Z-score, which uses a statistical test to rank all the sequence-template alignments and then selects the top-ranked template for the sequence. However, computing Z-scores is time-consuming and therefore unsuitable for genome-scale structure prediction. Z-scores are also hard to interpret when the threading scoring function is a weighted sum of several energy terms with different physical meanings. This paper presents a support vector machine (SVM) regression approach that directly predicts the alignment accuracy of each sequence-template alignment; the predicted accuracy is then used to rank all the templates for a specific target sequence. Experimental results on a large-scale benchmark demonstrate that SVM regression performs much better than the composition-corrected Z-score method and also runs much faster.
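As a rough illustration of why Z-score ranking is costly, the sketch below ranks templates by a shuffle-based Z-score. It is not the paper's implementation: `toy_score` is a hypothetical stand-in for a real threading score, and estimating the background by shuffling the target sequence (`n_shuffles` extra scorings per template) is an assumption about how the statistical test is typically carried out.

```python
import random
import statistics


def toy_score(seq, template):
    """Hypothetical stand-in for a threading score: count identical positions."""
    return sum(a == b for a, b in zip(seq, template))


def z_score(seq, template, score_fn=toy_score, n_shuffles=100, seed=0):
    """Z-score of the raw score against scores of shuffled copies of seq.

    Shuffling preserves amino acid composition, so the background reflects
    what a sequence of the same composition would score by chance.
    """
    rng = random.Random(seed)
    raw = score_fn(seq, template)
    residues = list(seq)
    background = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # in-place permutation of the target residues
        background.append(score_fn("".join(residues), template))
    mean = statistics.fmean(background)
    std = statistics.pstdev(background)
    if std == 0:
        return 0.0  # degenerate background: no basis for a Z-score
    return (raw - mean) / std


def rank_templates(seq, templates, **kwargs):
    """Return (template_name, z) pairs sorted from best to worst Z-score."""
    scored = [(name, z_score(seq, t, **kwargs)) for name, t in templates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    target = "ACDEFGHIKLMNPQRSTVWY"
    library = {"identical": target, "reversed": target[::-1]}
    for name, z in rank_templates(target, library):
        print(f"{name}: Z = {z:.2f}")
```

Each template requires `n_shuffles` additional scorings on top of the original alignment, which is the overhead a single regression prediction per alignment would avoid.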
Keywords :
biology computing; genetics; molecular biophysics; molecular configurations; prediction theory; proteins; regression analysis; support vector machines; Z-score; fold recognition; genome-scale structure prediction; optimal sequence-template alignments; predicted alignment accuracy; protein structure prediction; protein threading; support vector machine regression; Accuracy; Algorithm design and analysis; Bioinformatics; Genomics; Large-scale systems; Libraries; Proteins; Support vector machines; Target recognition; Testing; Protein structure prediction; SVM regression; protein fold recognition; protein threading; Algorithms; Amino Acid Sequence; Binding Sites; Computer Simulation; Models, Chemical; Models, Molecular; Molecular Sequence Data; Protein Binding; Protein Folding; Proteins; Reproducibility of Results; Sensitivity and Specificity; Sequence Alignment; Sequence Analysis, Protein; Sequence Homology, Amino Acid
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2005.24
Filename :
1438352