DocumentCode :
742887
Title :
A Combination of Feature Extraction Methods with an Ensemble of Different Classifiers for Protein Structural Class Prediction Problem
Author :
Dehzangi, A. ; Paliwal, K. ; Sharma, A. ; Dehzangi, O. ; Sattar, A.
Author_Institution :
Inst. of Integrated & Intell. Syst., Griffith Univ., Griffith, QLD, Australia
Volume :
10
Issue :
3
fYear :
2013
Firstpage :
564
Lastpage :
575
Abstract :
A better understanding of the structural class of a given protein reveals important information about its overall folding type and its domain. It can also be used directly to provide critical information on the general tertiary structure of a protein, which has a profound impact on protein function determination and drug design. Despite the substantial improvements made by pattern recognition-based approaches, the problem remains unsolved in bioinformatics and demands more attention and exploration. In this study, we propose a novel feature extraction model that incorporates physicochemical and evolutionary-based information simultaneously. We also propose overlapped segmented distribution and autocorrelation-based feature extraction methods to provide more local and global discriminatory information. The proposed feature extraction methods are explored for the 15 most promising attributes, selected from a wide range of physicochemical-based attributes. Finally, by applying an ensemble of different classifiers, namely AdaBoost.M1, LogitBoost, naive Bayes, multilayer perceptron (MLP), and support vector machine (SVM), we show enhancement of the protein structural class prediction accuracy on four popular benchmarks.
Keywords :
Bayes methods; bioinformatics; biological techniques; evolutionary computation; feature extraction; molecular configurations; multilayer perceptrons; proteins; proteomics; support vector machines; Adaboost.M1; LogitBoost; autocorrelation-based feature extraction methods; evolutionary-based information; multilayer perceptron; naive Bayes; overlapped segmented distribution; pattern recognition; physicochemical-based features; protein structural class prediction problem; support vector machine; Mixture of feature extraction models; ensemble of different classifiers; overlapped segmented autocorrelation
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2013.65
Filename :
6520842