Title :
Super Granular Shrink-SVM Feature Elimination (Super GS-SVM-FE) Model for Protein Sequence Motif Information Extraction
Author :
Chen, Bernard ; Pellicer, Stephen ; Tai, Phang C. ; Harrison, Robert ; Pan, Yi
Author_Institution :
Georgia State Univ., Atlanta
Abstract :
Protein sequence motifs are attracting increasing attention in sequence analysis. These recurring regions have the potential to determine a protein's conformation, function, and activities. In our previous work, we sought protein sequence motifs that are universally conserved across protein family boundaries. Therefore, unlike most popular motif discovery algorithms, our input dataset is extremely large. To deal with large input datasets, we provided two granular computing models (the FIK and FGK models) to efficiently generate protein motif information, and the Super GSVM-FE model to perform feature elimination and improve the quality of the motif information. In this article, we aim to further improve our SVM feature elimination model to achieve three goals: reduce execution time by half, further improve motif information quality, and add the ability to adjust the number of filtered segments. Compared with the latest results, our new approach shows great improvements.
Keywords :
biology computing; macromolecules; proteins; support vector machines; conformation; elimination model; filtered segments; fuzzy-Greedy-Kmeans model; granular computing models; information extraction; protein sequence motifs; super granular shrink-SVM feature elimination model; Biological system modeling; Biology; Clustering algorithms; Computer science; Data mining; Information filtering; Information filters; Protein sequence; Sequences; Space technology;
Conference_Title :
Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4244-1509-0
DOI :
10.1109/BIBE.2007.4375591