DocumentCode :
464311
Title :
Real Value Solvent Accessibility Prediction using Adaptive Support Vector Regression
Author :
Gubbi, Jayavardhana ; Shilton, Alistair ; Palaniswami, M. ; Parker, Michael
Author_Institution :
Dept. of Electr. & Electron. Eng., Melbourne Univ., Parkville, Vic.
fYear :
2007
fDate :
1-5 April 2007
Firstpage :
395
Lastpage :
401
Abstract :
Knowledge of the secondary structure and solvent accessibility of a protein plays a vital role in the prediction of its fold and, eventually, its tertiary structure. This paper deals with the prediction of relative solvent accessibility given only the amino-acid sequence. We use an improved support vector regression (SVR) and new kernels for real-valued prediction of solvent accessibility. In this regard, two main issues are addressed. First, we address the problem of ε selection, which we found to be somewhat problematic in our earlier work (ε is a parameter with significant influence on the noise insensitivity and generalization of SVRs). In particular, rather than employ the standard trial-and-error approach, we use an improved tube shrinking method to find ε. Second, a novel kernel combining a solvation model, an electrostatic charge model and evolutionary information in the form of a position specific scoring matrix (PSSM) is given. A new dataset of 472 proteins with less than 20% sequence identity is curated and used to evaluate the results. To make a more objective comparison with earlier methods, we use a standard dataset and show that the proposed scheme is better than the ones normally used in the literature. We also report the lowest mean absolute error (MAE) so far, 0.12, on the standard dataset.
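The abstract names two quantities without defining them: the SVR tube-width parameter (ε) and the mean absolute error (MAE). As a minimal sketch, assuming the standard ε-insensitive loss of support vector regression and the usual definition of MAE, the two can be written as below. The function names and the sample values are illustrative assumptions, not taken from the paper, and the sketch does not implement the tube shrinking method or the proposed kernel.

```python
# Illustrative sketch only: standard textbook definitions, not the paper's code.

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample epsilon-insensitive loss: errors inside the tube cost nothing."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical relative solvent accessibility values in [0, 1].
observed = [0.10, 0.50, 0.90]
predicted = [0.15, 0.35, 0.90]

losses = epsilon_insensitive_loss(observed, predicted, epsilon=0.1)
mae = mean_absolute_error(observed, predicted)
```

With a tube width of 0.1, only the second sample (absolute error 0.15) falls outside the tube and contributes to the loss. A larger ε ignores more small errors, which is why its choice affects noise insensitivity and generalization.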
Keywords :
biology computing; proteins; regression analysis; support vector machines; adaptive support vector regression; electrostatic charge model; evolutionary information; position specific scoring matrix; protein secondary structure; protein tertiary structure; real value solvent accessibility prediction; solvation model; Feedforward neural networks; Feedforward systems; Kernel; Multi-layer neural network; Neural networks; Proteins; Sequences; Solvents; Support vector machine classification; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Bioinformatics and Computational Biology, 2007. CIBCB '07. IEEE Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0710-9
Type :
conf
DOI :
10.1109/CIBCB.2007.4221249
Filename :
4221249