• DocumentCode
    3265041
  • Title

    Protein-Protein Interaction Prediction Based on Sequence Data by Support Vector Machine with Probability Assignment

  • Author

    Ye, Jiankuan ; Kulikowski, Casimir ; Muchnik, Ilya

  • Author_Institution
    Department of Computer Science, Rutgers University, Piscataway, NJ, 08854, USA, Email: jiye@cs.rutgers.edu
  • fYear
    2005
  • fDate
    14-15 Nov. 2005
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    In this paper, we investigate sequence-based protein-protein interaction prediction using machine learning methods. Specifically, we propose building classifiers in the space of domain pairs, which are based purely on sequence data. We design a novel way to select negative samples using a classification-based iterative voting procedure, and systematically compare the effects of negative sample selection on classification performance. We also propose an approach to estimate probabilities for SVM predictions. Based on the selected negative samples, we compare a nonlinear SVM with a Gaussian kernel, a linear SVM, and linear logistic regression in terms of both classification performance and probability assignment. Our results show that the probabilities assigned by the SVM are more natural than those assigned by logistic regression, and that the SVM also outperforms logistic regression in prediction.
  • Keywords
    Computer science; Kernel; Learning systems; Logistics; Machine learning; Proteins; Sequences; Support vector machine classification; Support vector machines; Voting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence in Bioinformatics and Computational Biology, 2005. CIBCB '05. Proceedings of the 2005 IEEE Symposium on
  • Print_ISBN
    0-7803-9387-2
  • Type

    conf

  • DOI
    10.1109/CIBCB.2005.1594935
  • Filename
    1594935