Title :
A SVM for GPCR Protein Prediction Using Pattern Discovery
Author :
Nascimento, F. ; Tsang, I.R. ; Cavalcanti, George D C
Author_Institution :
Center of Inf., Fed. Univ. of Pernambuco, Recife
Abstract :
Machine learning techniques have been applied to many different problems in bioinformatics. Similarly, pattern discovery algorithms have been used to uncover hidden motifs in protein sequences, contributing greatly to the understanding of protein classification. G-protein coupled receptors (GPCRs) represent one of the largest protein families in the human genome. Most of these receptors are major targets for drug discovery and development and are therefore of interest to the pharmaceutical industry. The technique used in this paper combines machine learning and pattern discovery methods to predict the functional class of a protein, specifically to predict the GPCR protein class. Vilo [2] proposed an algorithm that extracts regular-expression patterns from known GPCR protein sequences and used them to predict the coupling specificity of G-protein coupled receptors to their G proteins. We analyze these patterns and combine them as input features for an SVM that predicts the GPCR superclass. We report the results using ROC curves, which are well suited to evaluating the performance of this kind of classifier. The experiments, based on the GPCRDB database, also showed that we were able to find some novel GPCR sequences that were not described in the PROSITE database.
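The pipeline outlined in the abstract (regular-expression patterns turned into features, then scored with a ROC-based measure) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the three motifs, the toy sequences, and the helper names `to_features` and `roc_auc` are hypothetical and are not taken from the paper. The SVM itself is left out; a simple match count stands in for the classifier's decision value.

```python
import re

# Hypothetical PROSITE-like motifs, invented for illustration only;
# they are not patterns reported in the paper.
PATTERNS = [r"DRY", r"NP..Y", r"C.{2,4}C"]
COMPILED = [re.compile(p) for p in PATTERNS]


def to_features(sequence):
    """Return one 0/1 entry per pattern: 1 if the pattern occurs in the sequence."""
    return [1 if rx.search(sequence) else 0 for rx in COMPILED]


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy sequences (made up), labelled 1 for GPCR-like and 0 otherwise.
sequences = ["MKDRYLAINPLLYCAAC", "MNPAAYDRYG", "MKKLLAAGG", "MCAACLL"]
labels = [1, 1, 0, 0]

features = [to_features(s) for s in sequences]

# Stand-in score: the number of matched patterns. In the setting the abstract
# describes, the feature vectors would instead be passed to an SVM and its
# decision values would be the scores.
scores = [sum(f) for f in features]

print(features)
print(roc_auc(scores, labels))
```

In a fuller version, the binary vectors would be the training input to an SVM implementation of the reader's choice, and the ROC curve would be traced by sweeping a threshold over the SVM's decision values.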
Keywords :
biology computing; pattern classification; proteins; support vector machines; G-protein coupled receptors; GPCR; GPCRDB database; PROSITE database; SVM; bioinformatics; human genome; machine learning; pattern discovery; pharmaceutical industry; protein prediction; drugs; genomics; humans; industrial relations; pharmaceuticals; protein engineering; support vector machine classification; prediction
Conference_Title :
Hybrid Intelligent Systems, 2008. HIS '08. Eighth International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-0-7695-3326-1
Electronic_ISBN :
978-0-7695-3326-1
DOI :
10.1109/HIS.2008.51