DocumentCode
478688
Title
Analysis of binary feature mapping rules for promoter recognition in imbalanced DNA sequence datasets using Support Vector Machine
Author
Damasevicius, R.
Author_Institution
Software Eng. Dept., Kaunas Univ. of Technol., Kaunas
Volume
2
fYear
2008
fDate
6-8 Sept. 2008
Firstpage
42694
Lastpage
42699
Abstract
Recognizing specific, functionally important DNA sequence fragments is considered one of the most important problems in bioinformatics. One such type of fragment is the promoter, a short regulatory DNA sequence located upstream of a gene. Detecting promoters in DNA sequences is important for successful gene prediction. In this paper, a machine learning method, the support vector machine (SVM), is used to classify DNA sequences and recognize promoters. To optimize classification, 11 rules for mapping DNA sequences into a binary SVM feature space are analyzed. Classification is performed using a power series kernel function. Kernel parameters are optimized using a modification of the Nelder-Mead (downhill simplex) optimization method. Classification results for Drosophila and human sequence datasets are presented.
Keywords
DNA; bioinformatics; data mining; feature extraction; genetics; learning (artificial intelligence); molecular biophysics; optimisation; pattern classification; sequences; support vector machines; Nelder-Mead optimization method; binary SVM feature mapping rule analysis; bioinformatics; biomolecular data mining; downhill simplex method; gene prediction; kernel parameter optimization; machine learning method; power series kernel function; promoter recognition; short regulatory DNA sequence dataset classification; support vector machine; Bioinformatics; Biological cells; DNA; Encoding; Kernel; Learning systems; Sequences; Support vector machine classification; Support vector machines; Training data; DNA mapping rules; Support Vector Machine; bioinformatics; data mining; promoter recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems, 2008. IS '08. 4th International IEEE Conference
Conference_Location
Varna
Print_ISBN
978-1-4244-1739-1
Electronic_ISBN
978-1-4244-1740-7
Type
conf
DOI
10.1109/IS.2008.4670503
Filename
4670503