Title :
Transmembrane segments prediction with support vector machine based on high performance encoding schemes
Author :
Hu, Hae-Jin ; Harrison, Robert ; Tai, Phang C. ; Pan, Yi
Author_Institution :
Dept. of Comput. Sci., Georgia State Univ., Atlanta, GA, USA
Abstract :
A new scheme for predicting transmembrane (TM) segments was developed based on the support vector machine (SVM). To apply the SVM more efficiently, three optimization steps were performed: encoding scheme selection, sliding window size selection, and parameter optimization. In the encoding scheme optimization, the position-specific scoring matrix (PSSM) encoding scheme proved to be the most informative, and the prediction accuracy (Q2) with this scheme reached up to 92%. In a performance comparison with previous studies, the PSSM encoding scheme shows the highest prediction accuracy among the common prediction methods, with an accuracy improvement of more than 13%. To verify the scheme, a blind test was run on the E. coli SecE and E. coli SecY transmembrane proteins, and the results agree reasonably well with the SwissProt database information and the TopPred results. However, a second blind test on five SecA proteins leaves room for discussion, since it predicts TM segments about 8-9 residues long for all five proteins.
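As an illustration of the sliding-window encoding and the Q2 measure mentioned in the abstract, the sketch below turns a per-residue profile (one row of scores per residue, as in a PSSM) into one feature vector per residue, and scores a two-state prediction. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `window_features` and `q2`, the zero-padding at the sequence ends, the toy two-column profile (a real PSSM has 20 columns), and the reading of Q2 as the fraction of residues correctly assigned to one of two states are all assumptions made here.

```python
from typing import List, Sequence


def window_features(profile: Sequence[Sequence[float]], window: int) -> List[List[float]]:
    """Build one flattened feature vector per residue from a centered window.

    Positions that fall outside the sequence are filled with zero rows
    (an assumption; the abstract does not say how ends are handled).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    width = len(profile[0]) if profile else 0
    pad = [0.0] * width
    features = []
    for i in range(len(profile)):
        vec: List[float] = []
        for j in range(i - half, i + half + 1):
            row = profile[j] if 0 <= j < len(profile) else pad
            vec.extend(row)
        features.append(vec)
    return features


def q2(predicted: Sequence[str], observed: Sequence[str]) -> float:
    """Fraction of residues whose two-state label is predicted correctly."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


# Toy profile: 4 residues, 2 score columns each (hypothetical values).
profile = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
features = window_features(profile, window=3)

# Hypothetical labels: "M" = transmembrane, "N" = non-transmembrane.
accuracy = q2("MMNN", "MNNN")
```

Each vector in `features` has `window` times the profile width entries and could be passed to any SVM implementation as one training sample; the window size is then one of the quantities to tune, alongside the SVM parameters.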
Keywords :
biology computing; biomembranes; learning (artificial intelligence); microorganisms; optimisation; proteins; support vector machines; E.coli SecE transmembrane protein; E.coli SecY transmembrane protein; SecA proteins; SwissProt database information; TopPred results; blind test result; orthogonal matrix; parameter optimization; position-specific scoring matrix encoding scheme; sliding window size; support vector machine; transmembrane segments prediction; Accuracy; Amino acids; Biology; Biomembranes; Computer science; Databases; Encoding; Proteins; Support vector machines; Testing;
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology, 2004. CIBCB '04. Proceedings of the 2004 IEEE Symposium on
Print_ISBN :
0-7803-8728-7
DOI :
10.1109/CIBCB.2004.1393945