DocumentCode
2961608
Title
Active learning for the prediction of phosphorylation sites
Author
Jun Jiang ; Ip, Horace H. S.
Author_Institution
Image Comput. Group, City Univ. of Hong Kong, Hong Kong
fYear
2008
fDate
1-8 June 2008
Firstpage
3158
Lastpage
3165
Abstract
In this paper, we propose several active learning strategies to train classifiers for phosphorylation site prediction. We show that active learning combined with a support vector machine (SVM) produces classifiers whose phosphorylation site prediction performance is comparable to or better than that of conventional SVM techniques while requiring significantly fewer annotated protein training samples. The result has both conceptual and practical implications for protein prediction: it exploits information inherent in the large-scale database of non-annotated protein samples and reduces the manual labor required for protein annotation. To the best of our knowledge, active learning has not previously been explored for phosphorylation site prediction. Several active learning strategies were investigated in this work: a single-running mode and a batch-running mode with sample and support vector diversity. Our experiments show that active learning with SVM reduces the protein annotation effort by 6.6% to 25.7% while yielding prediction performance similar to that of the conventional SVM technique.
Keywords
biochemistry; bioinformatics; learning (artificial intelligence); pattern classification; proteins; support vector machines; SVM; active learning; annotated protein training sample; batch-running mode; large scale database; phosphorylation site prediction; single-running mode; support vector machine; Application software; Biochemistry; Computer science; Databases; Large-scale systems; Machine learning; Neural networks; Protein engineering; Support vector machine classification; Support vector machines
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
Conference_Location
Hong Kong
ISSN
1098-7576
Print_ISBN
978-1-4244-1820-6
Electronic_ISBN
1098-7576
Type
conf
DOI
10.1109/IJCNN.2008.4634245
Filename
4634245