DocumentCode
2748273
Title
A Statistical Content-based Method for Promoter Prediction Based on Sequence Segments with Wildcards
Author
Zhu, Hongmei ; Wang, Jiaxin ; Zhao, Yannan ; Yang, Zehong
Author_Institution
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
Volume
2
fYear
0
fDate
0-0 0
Firstpage
9713
Lastpage
9717
Abstract
Sequence segments with wildcards, called IUPAC words, are a kind of context feature that the content-based promoter prediction system PromoterInspector has used successfully for eukaryotic polymerase II promoter prediction. Inspired by this, a new statistical method for promoter prediction is designed based on these context features, which also incorporates some techniques used in the well-known interpolated Markov chains (IMC) model. Tested on the vertebrate promoter dataset collected by Martin Reese, the new method performs much better than PromoterInspector and is comparable to the popular content-based IMC predictors. Combining this statistical IUPAC-based method with IMC yields some improvement in discriminative power over IMC alone.
Keywords
DNA; Markov processes; biology computing; feature extraction; molecular biophysics; DNA; PromoterInspector system; context features; eukaryotic polymerase II promoter prediction; interpolated Markov chains; sequence segments; statistical content-based method; vertebrate promoter dataset; wildcards; Biology computing; Computational biology; Computer science; DNA; Intelligent systems; Laboratories; Polymers; RNA; Sequences; Testing; IUPAC words; Interpolated Markov chains; promoter prediction;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Control and Automation, 2006. WCICA 2006. The Sixth World Congress on
Conference_Location
Dalian
Print_ISBN
1-4244-0332-4
Type
conf
DOI
10.1109/WCICA.2006.1713889
Filename
1713889