• DocumentCode
    2961608
  • Title
    Active learning for the prediction of phosphorylation sites
  • Author
    Jun Jiang ; Ip, Horace H. S.
  • Author_Institution
    Image Comput. Group, City Univ. of Hong Kong, Hong Kong
  • fYear
    2008
  • fDate
    1-8 June 2008
  • Firstpage
    3158
  • Lastpage
    3165
  • Abstract
    In this paper, we propose several active learning strategies to train classifiers for phosphorylation site prediction. We show that active learning combined with a support vector machine (SVM) produces classifiers whose phosphorylation site prediction performance is comparable to or better than that of conventional SVM techniques while requiring significantly fewer annotated protein training samples. The result has both conceptual and practical implications for protein prediction: it exploits information inherent in large-scale databases of non-annotated protein samples and reduces the amount of manual labor required for protein annotation. To the best of our knowledge, active learning has not previously been explored for phosphorylation site prediction. Several active learning strategies were investigated in this work: single-running mode, and batch-running mode with sample and support vector diversity. Our experiments show that active learning with SVM reduces the protein annotation effort by 6.6% to 25.7% while yielding prediction performance similar to that of the conventional SVM technique.
  • Keywords
    biochemistry; bioinformatics; learning (artificial intelligence); pattern classification; proteins; support vector machines; SVM; active learning; annotated protein training sample; batch-running mode; large scale database; pattern classification; phosphorylation site prediction; single-running mode; support vector machine; Application software; Biochemistry; Computer science; Databases; Large-scale systems; Machine learning; Neural networks; Protein engineering; Support vector machine classification; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1820-6
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2008.4634245
  • Filename
    4634245