DocumentCode
60381
Title
Disulfide Connectivity Prediction Based on Modelled Protein 3D Structural Information and Random Forest Regression
Author
Dong-Jun Yu ; Yang Li ; Jun Hu ; Xibei Yang ; Jing-Yu Yang ; Hong-Bin Shen
Author_Institution
Sch. of Comput. Sci. & Eng., Nanjing Univ. of Sci. & Technol., Nanjing, China
Volume
12
Issue
3
fYear
2015
fDate
May-June 1 2015
Firstpage
611
Lastpage
621
Abstract
Disulfide connectivity is an important protein structural characteristic. Accurately predicting disulfide connectivity solely from protein sequence helps to improve the understanding of protein structure and function, especially in the post-genome era, in which large volumes of sequenced proteins without functional annotation are rapidly accumulating. In this study, a new feature extracted from predicted protein 3D structural information is proposed and integrated with traditional features to form discriminative features. Based on the extracted features, a random forest regression model is used to predict protein disulfide connectivity. We compare the proposed method with popular existing predictors by performing both cross-validation and independent validation tests on benchmark datasets. The experimental results demonstrate the superiority of the proposed method over existing predictors. We believe this superiority stems from both the good discriminative capability of the newly developed features and the powerful modelling capability of the random forest. The web server implementation, called TargetDisulfide, and the benchmark datasets are freely available for academic use at http://csbio.njust.edu.cn/bioinf/TargetDisulfide.
Keywords
bioinformatics; feature extraction; genomics; molecular biophysics; molecular configurations; proteins; regression analysis; TargetDisulfide; Web server; discriminative features; disulfide connectivity prediction; post-genome era; protein 3D structural information; protein function; protein sequence; protein structure; random forest regression; Benchmark testing; Educational institutions; IEEE transactions; Three-dimensional displays; Protein structure prediction; random forest; regression model
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2014.2359451
Filename
6967829