DocumentCode :
3107226
Title :
Classification of enzyme functional classes and subclasses using support vector machine
Author :
Yadav, Sanjeev Kumar ; Bhola, Amit ; Tiwari, Arvind Kumar
Author_Institution :
Dept. of CSE, KIT, Varanasi, India
fYear :
2015
fDate :
25-27 Feb. 2015
Firstpage :
411
Lastpage :
417
Abstract :
Enzymes play an important role in metabolism by catalyzing biochemical reactions. Determining enzyme function experimentally is costly and time-consuming, so a computational method is needed to predict it. This paper presents a supervised machine learning approach that predicts the functional classes and subclasses of protein sequences, including enzymes and non-enzymes, from 857 sequence-derived features. The features come from seven sequence-derived properties: amino acid composition, dipeptide composition, correlation features, composition, transition, distribution, and pseudo amino acid composition. Recursive feature elimination (RFE) is used to select the optimal number of features. A support vector machine (SVM) is used to construct a three-level model on the features selected by SVM-RFE: the first (top) level classifies a query protein as an enzyme or non-enzyme, the second level predicts the enzyme functional class, and the third level predicts the subfunctional class. The proposed model reports an overall accuracy of 97.6%, a precision of 97.8%, and a Matthews correlation coefficient (MCC) of 0.93 for the first level; an accuracy of 87.3%, a precision of 87.7%, and an MCC of 0.84 for the second level; and an accuracy of 85.6%, a precision of 87.9%, and an MCC of 0.86 for the third level.
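As a rough illustration of two of the feature families named in the abstract (amino acid composition and dipeptide composition) and of the binary MCC used to report first-level results, the sketch below may help. It is not the authors' implementation: the function names, the 20-letter alphabet ordering, and the handling of non-standard residues are assumptions made here, and the remaining feature families, SVM-RFE, and the three-level SVM are not shown.

```python
from itertools import product
from math import sqrt

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Fraction of each standard residue in seq (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair in seq (400 values).

    Pairs containing a non-standard letter are skipped (an assumption).
    """
    seq = seq.upper()
    index = {d: i for i, d in enumerate(DIPEPTIDES)}
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in index:
            counts[index[pair]] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]


def matthews_cc(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column sum is zero
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # hypothetical sequence, for illustration only
    features = amino_acid_composition(example) + dipeptide_composition(example)
    print(len(features))  # 420 of the 857 features described in the abstract
```

Concatenating the two vectors gives 420 values per sequence; the other five properties listed in the abstract would account for the rest of the 857.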
Keywords :
bioinformatics; enzymes; feature selection; learning (artificial intelligence); organic compounds; pattern classification; support vector machines; MCC value; Matthew correlation coefficient; SVM-RFE; dipeptide composition; enzyme function classification; enzyme function prediction; enzyme functional classes; enzyme subfunctional class; feature selection; protein sequences; pseudoamino acid composition; recursive feature elimination technique; sequence derived features; supervised machine learning approach; support vector machine; Accuracy; Amino acids; Feature extraction; Predictive models; Proteins; Support vector machines; Classification; Enzyme; SVM-RFE; support vector machine;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), 2015 International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-8432-9
Type :
conf
DOI :
10.1109/ABLAZE.2015.7155031
Filename :
7155031