Title :
Protein secondary structure prediction using support vector machine with a PSSM profile and an advanced tertiary classifier
Author :
Hu, Hae-Jin ; Tai, Phang C. ; He, Jieyue ; Harrison, Robert ; Pan, Yi
Author_Institution :
Dept. of Comput. Sci., Georgia State Univ., Atlanta, GA, USA
Abstract :
In this study, a support vector machine (SVM) is applied as the learning machine for protein secondary structure prediction. A position-specific scoring matrix (PSSM) is adopted as the encoding scheme for training the SVM. To improve prediction accuracy, three optimization processes are performed: optimization of the encoding scheme, of the sliding window size, and of the parameters. For multi-class classification, the results of three one-versus-one binary classifiers (H/E, E/C and C/H) are combined using our new tertiary classifier, called SVM_Represent. With this new tertiary classifier, the Q3 prediction accuracy reaches 89.6% on the RS126 dataset and 90.1% on the CB513 dataset. The Segment Overlap Measure (SOV) is 85.0% on the RS126 dataset and 85.7% on the CB513 dataset. Compared with the best existing prediction methods, our new prediction algorithm improves accuracy by about 13% in terms of Q3 and SOV, the two most commonly used accuracy measures.
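The pipeline outlined in the abstract (sliding-window PSSM encoding, combination of the three one-versus-one outputs, and Q3 scoring) can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: the window size of 15, zero-padding at the sequence ends, and the function names are all hypothetical. The combination step is a plain majority vote used as a stand-in, not the SVM_Represent classifier, whose details are not given here.

```python
from collections import Counter


def window_features(pssm, center, window=15):
    """Flatten the PSSM rows inside a window centered on `center`.

    `pssm` is a list of per-residue score rows. Positions that fall
    outside the sequence are zero-padded (an assumption of this sketch).
    """
    half = window // 2
    width = len(pssm[0])
    features = []
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            features.extend([0.0] * width)
    return features


def combine_one_vs_one(votes):
    """Combine the H/E, E/C and C/H binary outputs by majority vote.

    A three-way tie falls back to coil ('C'); this tie rule is an
    assumption and is NOT the SVM_Represent method from the abstract.
    """
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "C"
    return counts[0][0]


def q3(predicted, observed):
    """Percentage of residues whose three-state label is predicted correctly."""
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

In use, each residue's `window_features` vector would be passed to the three binary SVMs, their labels combined with `combine_one_vs_one`, and the resulting string of H/E/C labels scored against the observed structure with `q3`.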
Keywords :
biochemistry; biology computing; molecular biophysics; optimisation; parameter estimation; proteins; support vector machines; CB513 dataset; PSSM profile; RS 126 dataset; SOV; SVM; advanced tertiary classifier; binary classifiers; encoding scheme; learning machine; optimization process; parameter optimization; position-specific scoring matrix; prediction accuracy; prediction algorithm; prediction method; protein secondary structure prediction; segment overlap measure; sliding window size; support vector machine; Accuracy; Computer science; Encoding; Helium; Kernel; Machine learning; Proteins; Support vector machine classification; Support vector machines; Testing;
Conference_Title :
Computational Systems Bioinformatics Conference, 2005. Workshops and Poster Abstracts. IEEE
Print_ISBN :
0-7695-2442-7
DOI :
10.1109/CSBW.2005.114