• DocumentCode
    60381
  • Title

    Disulfide Connectivity Prediction Based on Modelled Protein 3D Structural Information and Random Forest Regression

  • Author

    Dong-Jun Yu ; Yang Li ; Jun Hu ; Xibei Yang ; Jing-Yu Yang ; Hong-Bin Shen

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Nanjing Univ. of Sci. & Technol., Nanjing, China
  • Volume
    12
  • Issue
    3
  • fYear
    2015
  • fDate
    May-June 1 2015
  • Firstpage
    611
  • Lastpage
    621
  • Abstract
    Disulfide connectivity is an important protein structural characteristic. Accurately predicting disulfide connectivity solely from protein sequence helps to improve the intrinsic understanding of protein structure and function, especially in the post-genome era, in which large volumes of sequenced proteins without functional annotation are quickly accumulating. In this study, a new feature extracted from predicted protein 3D structural information is proposed and integrated with traditional features to form discriminative features. Based on the extracted features, a random forest regression model is used to predict protein disulfide connectivity. We compare the proposed method with popular existing predictors by performing both cross-validation and independent validation tests on benchmark datasets. The experimental results demonstrate the superiority of the proposed method over existing predictors. We believe this superiority stems from both the good discriminative capability of the newly developed features and the powerful modelling capability of the random forest. The web server implementation, called TargetDisulfide, and the benchmark datasets are freely available for academic use at: http://csbio.njust.edu.cn/bioinf/TargetDisulfide
  • Keywords
    bioinformatics; feature extraction; genomics; molecular biophysics; molecular configurations; proteins; regression analysis; TargetDisulfide; Web server; discriminative features; disulfide connectivity prediction; post-genome era; protein 3D structural information; protein function; protein sequence; protein structure; random forest regression; Benchmark testing; Educational institutions; IEEE transactions; Three-dimensional displays; Protein structure prediction; random forest; regression model
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2014.2359451
  • Filename
    6967829