• DocumentCode
    553169
  • Title

    A new modeling strategy for eukaryotic promoter recognition and prediction

  • Author

    Shuanhu Wu ; Wenyan Zhang ; Qicheng Liu ; Yibin Song ; Chuangcun Wang

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Yantai Univ., Yantai, China
  • Volume
    3
  • fYear
    2011
  • fDate
    26-28 July 2011
  • Firstpage
    1565
  • Lastpage
    1569
  • Abstract
    In this paper, we present a new modeling strategy for the recognition and prediction of promoter regions. The model rests on two considerations: (1) a promoter region comprises a number of binding sites (consensus sequences) to which RNA polymerase II can bind to start transcription of a gene, and different promoters can be characterized by different combinations of binding sites; (2) the spacing between these binding sites is not always consistent, and some positions show nucleotide variation across genes and species. Based on these considerations, we first split the promoter region into equal intervals and, using training sets, calculate for each interval the occurrence probability of each word that is assumed to be a binding-site sequence. We combine these interval probabilities into one matrix, which we refer to as the Interval Position Weight Matrix (IPWM); we then introduce a new promoter modeling strategy and feature extraction method based on a maximal probability model and the IPWM. Tests on large genomic sequences and comparisons with several well-known algorithms show that our algorithm is efficient and achieves higher sensitivity and specificity.
  • Keywords
    biochemistry; enzymes; genetics; genomics; macromolecules; matrix algebra; molecular biophysics; probability; Eukaryotic promoter region prediction; Eukaryotic promoter region recognition; IPWM; RNA polymerase; binding site; feature abstracting method; genomic sequence; interval position weight matrix; interval probability; maximal probability model; nucleotide variation; promoter modeling strategy; training set; word probability; Algorithm design and analysis; Bioinformatics; DNA; Genomics; Humans; Prediction algorithms; Training; eukaryotic biology; gene recognition; promoter prediction; promoter region modeling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-61284-180-9
  • Type

    conf

  • DOI
    10.1109/FSKD.2011.6019817
  • Filename
    6019817
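
The interval-wise word probabilities described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the word length, the number of intervals, the smoothing, or the exact form of the maximal probability model. The function names (`build_ipwm`, `max_prob_features`), the parameters `n_intervals`, `k`, and `pseudocount`, and the choice of "largest word probability per interval" as the feature are all assumptions made for illustration, as is the requirement that all training sequences are aligned and of equal length.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def interval_bounds(seq_len, n_intervals):
    """Return (start, end) pairs splitting seq_len into equal intervals."""
    size = seq_len // n_intervals
    return [(i * size, (i + 1) * size) for i in range(n_intervals)]


def words_in(seq, start, end, k):
    """All overlapping k-letter words lying entirely inside [start, end)."""
    return [seq[i:i + k] for i in range(start, end - k + 1)]


def build_ipwm(train_seqs, n_intervals, k, pseudocount=1.0):
    """Hypothetical IPWM: one row per interval, one column per word.

    Each row is a probability distribution over all 4**k words, estimated
    from equal-length training sequences. The pseudocount is an assumption
    (not taken from the abstract) that avoids zero probabilities.
    """
    vocab = ["".join(p) for p in product(ALPHABET, repeat=k)]
    seq_len = len(train_seqs[0])
    ipwm = []
    for start, end in interval_bounds(seq_len, n_intervals):
        counts = Counter()
        for seq in train_seqs:
            counts.update(words_in(seq, start, end, k))
        total = sum(counts[w] for w in vocab) + pseudocount * len(vocab)
        ipwm.append({w: (counts[w] + pseudocount) / total for w in vocab})
    return ipwm


def max_prob_features(seq, ipwm, k):
    """Assumed feature: the largest word probability found in each interval."""
    bounds = interval_bounds(len(seq), len(ipwm))
    return [
        max((row.get(w, 0.0) for w in words_in(seq, start, end, k)), default=0.0)
        for row, (start, end) in zip(ipwm, bounds)
    ]
```

Under these assumptions, `build_ipwm(train_seqs, n_intervals=2, k=2)` yields two rows of 16 word probabilities each, and `max_prob_features(seq, ipwm, k=2)` reduces a candidate sequence to one score per interval. Taking the maximum within an interval, rather than scoring a fixed position, is one way to tolerate the variable spacing of binding sites that the abstract mentions.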