Title :
Prediction of Protein Structural Class Using PSI-BLAST Profile Based Collocation of Amino Acid Pairs
Author :
Chen, Ke ; Kurgan, Lukasz ; Ruan, Jishou
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Alberta, Edmonton, AB
Abstract :
Knowledge of structural classes is useful in understanding folding patterns in proteins. Numerous structural class prediction methods have been proposed in the past. Although virtually all state-of-the-art classifiers have already been tried, many of these methods use a very simple protein sequence representation that often includes amino acid (AA) composition. To address this, we propose a novel sequence representation, which is based on PSI-BLAST profile based collocation of AA pairs. We used two benchmark datasets constructed by Zhou (J. of Prot Chem. 1998, 17(8):729-38) to test the proposed representation with five representative classifiers. The two best classifiers, a support vector machine and an instance-based learner, achieved 88% and 96% accuracy on the two datasets, respectively. Our results were compared with five recently proposed methods. The comparison shows the superiority of the proposed method, which reduces the error rates by 30% and 21% on the two datasets when compared with the best-performing ensemble of boosted logistic regression classifiers. Finally, the new sequence representation is compared with AA composition using a support vector machine classifier. The error rate reduction due to application of the new representation equals 40% and 25% for the two datasets, respectively. In short, the PSI-BLAST profile based collocation of AA pairs is shown to be a promising feature-based sequence representation.
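The error rate reductions quoted in the abstract can be related to the reported accuracies with a short calculation. The sketch below assumes that "error rate reduction" means a relative reduction, (baseline error - new error) / baseline error, and that the 30% and 21% figures apply to the 88% and 96% accuracies; neither assumption is stated in the abstract. The function names are hypothetical, and because the published figures are rounded, the implied baseline error rates (roughly 17% and 5%) are only approximate.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Fraction of the baseline error removed by the new method (assumed definition)."""
    return (baseline_error - new_error) / baseline_error


def implied_baseline_error(new_error: float, reduction: float) -> float:
    """Invert the relative reduction to recover the baseline error rate."""
    return new_error / (1.0 - reduction)


# Rounded figures from the abstract: (accuracy of proposed method, quoted reduction).
reported = [(0.88, 0.30), (0.96, 0.21)]

for accuracy, reduction in reported:
    new_error = 1.0 - accuracy
    baseline = implied_baseline_error(new_error, reduction)
    print(f"accuracy={accuracy:.2f}  new error={new_error:.3f}  "
          f"implied baseline error={baseline:.3f}")
```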
Keywords :
biology computing; molecular biophysics; proteins; PSI-BLAST profile-based collocation; amino acid pairs; folding patterns; protein sequence representation; protein structural class; Amino acids; Databases; Error analysis; Mathematics; Partial response channels; Protein engineering; Protein sequence; Solvents; Support vector machine classification; Support vector machines
Conference_Title :
Bioinformatics and Biomedical Engineering, 2007. ICBBE 2007. The 1st International Conference on
Conference_Location :
Wuhan
Print_ISBN :
1-4244-1120-3
DOI :
10.1109/ICBBE.2007.8