• DocumentCode
    8366
  • Title
    Prediction of hot spots in protein interfaces using extreme learning machines with the information of spatial neighbour residues
  • Author
    Lin Wang ; Wenjuan Zhang ; Qiang Gao ; Congcong Xiong
  • Author_Institution
    Sch. of Comput. Sci. & Inf. Eng., Tianjin Univ. of Sci. & Technol., Tianjin, China
  • Volume
    8
  • Issue
    4
  • fYear
    2014
  • fDate
    8 2014
  • Firstpage
    184
  • Lastpage
    190
  • Abstract
    The identification of hot spots, a small subset of protein interface residues that accounts for the majority of binding free energy, is becoming increasingly important for research on protein-protein interaction and drug design. For each interface residue (target residue) to be predicted, the authors extract hybrid features that incorporate a wide range of information about the target residue and its spatial neighbor residues, that is, the nearest contact residue in the other face (mirror-contact residue) and the nearest contact residue in the same face (intra-contact residue). Feature selection is performed using random forests to avoid over-fitting. The extreme learning machine is then employed to integrate these hybrid features for predicting hot spots in protein interfaces. In 5-fold cross-validation on the training set, their method achieves an accuracy (ACC) of 82.1% and a Matthews correlation coefficient (MCC) of 0.459, and outperforms some alternative machine learning methods in the comparison study. Furthermore, their method achieves an ACC of 76.8% and an MCC of 0.401 on the independent test set, and is more effective than the major existing hot spot predictors. Their prediction method offers a powerful tool for uncovering candidate residues in alanine scanning mutagenesis studies of functional protein interaction sites.
  • Keywords
    biochemistry; bioinformatics; correlation methods; drugs; feature selection; free energy; learning (artificial intelligence); molecular biophysics; proteins; 5-fold cross validation; alanine scanning energetics database; alanine scanning mutagenesis; alternative machine learning methods; binding free energy; binding interface database; correlation coefficient; drug design; extreme learning machines; feature selection; functional protein interaction sites; hot spot prediction; independent test set; interface residue; nearest contact residue; protein interfaces; protein-protein interaction; random forests; spatial neighbor residues; spatial neighbour residue information
  • fLanguage
    English
  • Journal_Title
    Systems Biology, IET
  • Publisher
    IET
  • ISSN
    1751-8849
  • Type
    jour
  • DOI
    10.1049/iet-syb.2013.0049
  • Filename
    6869324
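
The abstract reports results as accuracy (ACC) and Matthews correlation coefficient (MCC). As a minimal sketch of how these two metrics are conventionally computed from binary predictions, the Python below uses the standard textbook definitions. The function names and the toy labels are illustrative assumptions; they are not taken from the paper and do not reproduce its data or results.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = hot spot, 0 = non-hot spot."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN); 0.0 if there are no samples."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    By the usual convention, 0.0 is returned when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels (hypothetical, for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("ACC =", accuracy(*counts))
print("MCC =", round(mcc(*counts), 3))
```

MCC is commonly reported alongside ACC for hot spot prediction because hot spots are a minority class: a predictor that labels every residue as a non-hot spot can still score a high ACC, while its MCC falls to zero.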