DocumentCode :
2454598
Title :
Non-Alignment Features Based Enzyme/Non-Enzyme Classification Using an Ensemble Method
Author :
Davidson, Nicholas J. ; Wang, Xueyi
Author_Institution :
Dept. of Math., Boise State Univ., Boise, ID, USA
fYear :
2010
fDate :
12-14 Dec. 2010
Firstpage :
546
Lastpage :
551
Abstract :
As a growing number of protein structures are resolved without known functions, computational methods that help predict protein function from structure are becoming increasingly important. Some computational methods predict protein function by aligning a protein to homologous proteins with known functions, but they fail when no such homology can be identified. In this paper we classify enzymes and non-enzymes using non-alignment features. We propose a new ensemble method that combines three support vector machines (SVMs) and two k-nearest neighbor (k-NN) classifiers under a simple majority voting rule. Tested on a data set of 697 enzymes and 480 non-enzymes adapted from Dobson and Doig, the method achieves 85.59% accuracy in 10-fold cross-validation and 86.49% accuracy in leave-one-out validation. This accuracy is much higher than that of other methods based on non-alignment features and even slightly higher than that of methods based on alignment features. To our knowledge, this is the first use of an ensemble method to classify enzymes and non-enzymes, and it outperforms a single classifier.
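The simple majority voting rule mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the five stand-in classifiers below are hypothetical threshold functions over a two-value feature vector, used only to show how five binary votes (1 = enzyme, 0 = non-enzyme) are combined. The names `majority_vote`, `ensemble_predict`, and `TOY_CLASSIFIERS` are assumptions introduced for illustration.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier maps a feature vector to a label: 1 (enzyme) or 0 (non-enzyme).
Classifier = Callable[[Sequence[float]], int]


def majority_vote(votes: Sequence[int]) -> int:
    """Return the label that received the most votes.

    With five voters and two labels, a tie cannot occur.
    """
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(classifiers: Sequence[Classifier],
                     features: Sequence[float]) -> int:
    """Collect one vote per classifier and apply the majority rule."""
    return majority_vote([clf(features) for clf in classifiers])


# Hypothetical stand-ins for the three SVMs and two k-NN classifiers.
# Real members would be trained models; these are fixed thresholds.
TOY_CLASSIFIERS = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[0] + x[1] > 1.0),
    lambda x: int(x[0] > 0.2),
    lambda x: int(x[1] > 0.8),
]

if __name__ == "__main__":
    print(ensemble_predict(TOY_CLASSIFIERS, [0.9, 0.9]))
    print(ensemble_predict(TOY_CLASSIFIERS, [0.6, 0.3]))
```

Using an odd number of voters is what lets a plain majority rule decide every binary case without a tie-breaking step.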
Keywords :
biology computing; enzymes; feature extraction; pattern classification; prediction theory; support vector machines; computational method; ensemble method; homologous protein functions prediction; k-nearest neighbor algorithms; nonalignment features based enzyme classification; protein structure; support vector machines; Accuracy; Kernel; Magnesium; Proteins; Support vector machine classification; ensemble methods; enzyme/non-enzyme classification; k-nearest neighbour algorithm; support vector machine;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
978-1-4244-9211-4
Type :
conf
DOI :
10.1109/ICMLA.2010.167
Filename :
5708884