• DocumentCode
    589210
  • Title

    Prediction of Protein Essentiality by the Support Vector Machine with Statistical Tests

  • Author

    Chiou-Yi Hor ; Chang-Biau Yang ; Zih-Jie Yang ; Chiou-Ting Tseng

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Nat. Sun Yat-sen Univ., Kaohsiung, Taiwan
  • Volume
    1
  • fYear
    2012
  • fDate
    12-15 Dec. 2012
  • Firstpage
    96
  • Lastpage
    101
  • Abstract
    Essential proteins deeply affect cellular life, but discriminating them experimentally is extremely time-consuming and labor-intensive. The goal of this paper is to identify the features that are crucial for discriminating protein essentiality and to build learning machines for prediction. We first collect features from a variety of sources. We then adopt a backward feature selection method and use the selected features to build SVM predictors. Cross-validation is conducted on the original imbalanced data set as well as on the down-sampled balanced data set. The performance of these feature subsets is then subjected to statistical tests to confirm their significance. For the imbalanced data set, our best F-measure and MCC values are 0.549 and 0.495, respectively. For the balanced data set, our best F-measure and MCC values are 0.770 and 0.545, respectively. These results are superior to all previous results under various performance measures.
  • Keywords
    biology computing; learning (artificial intelligence); proteins; statistical analysis; support vector machines; F-measure; SVM predictors; backward feature selection method; balanced data set down-sampling; cellular life; learning machines; protein discrimination; protein essentiality prediction; statistical tests; support vector machine; Amino acids; Data models; Feature extraction; Indexes; Machine learning; Proteins; Support vector machines; bioinformatics; essential protein; protein-protein interaction; statistical test
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2012 11th International Conference on
  • Conference_Location
    Boca Raton, FL
  • Print_ISBN
    978-1-4673-4651-1
  • Type

    conf

  • DOI
    10.1109/ICMLA.2012.25
  • Filename
    6406595