• DocumentCode
    1566583
  • Title
    Integration of knowledge-discovery and artificial-intelligence approaches for promoter recognition in DNA sequences
  • Author
    Huang, Yin-Fu ; Wang, Chia-Ming
  • Author_Institution
    Graduate Sch. of Comput. Sci. & Inf. Eng., National Yunlin Univ. of Sci. & Technol., Taiwan
  • Volume
    1
  • fYear
    2005
  • Firstpage
    459
  • Abstract
    Bioinformatics is currently a very active field. Many fascinating biological problems remain unsolved, even though a great number of diverse genomic sequences have been sequenced with the arrival of the post-genome era. Currently available programs are far from powerful enough to recognize regulatory signals completely. Researchers have looked for various types of patterns around the transcription start site (TSS) and tried to translate them into classification rules; however, these have not always been good solutions. In this paper, we propose a new hybrid learning system to recognize regulatory elements (i.e., promoters) in deoxyribonucleic acid (DNA) sequences. The proposed hybrid system calculates the distributions of oligonucleotide statistics as positional weight matrices, which help to discriminate promoters from non-promoters. This study can help to locate the expressed regions of DNA, and to predict and understand the properties, structures, and functions of the proteins synthesized from the coding regions of DNA. The system was evaluated on benchmark datasets using the leave-one-out method. The experimental results demonstrate that the proposed system achieves higher accuracy than the other systems compared.
  • Keywords
    DNA; biology computing; data mining; genetics; learning (artificial intelligence); DNA sequences; artificial intelligence; bioinformatics; classification rules; deoxyribonucleic acid sequences; genomic sequences; hybrid learning system; knowledge discovery; leave-one-out method; oligo-nucleotides statistics; positional weight matrices; promoter recognition; transcription start site; Bioinformatics; Computer science; DNA; Genetics; Genomics; Knowledge engineering; Polymers; Proteins; RNA; Sequences;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Information Technology and Applications, 2005. ICITA 2005. Third International Conference on
  • Print_ISBN
    0-7695-2316-1
  • Type
    conf
  • DOI
    10.1109/ICITA.2005.162
  • Filename
    1488848
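
The abstract mentions positional weight matrices and leave-one-out evaluation but gives no implementation details. The following is only a minimal illustrative sketch of those two ideas, not the authors' hybrid system. It assumes equal-length sequences, single-nucleotide (rather than oligonucleotide) statistics, a pseudocount of 1, and a log-odds decision threshold of zero; all function names and the toy sequences are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def build_pwm(seqs, pseudocount=1.0):
    """Estimate per-position nucleotide probabilities from equal-length sequences."""
    length = len(seqs[0])
    total = len(seqs) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        # The pseudocount keeps every probability non-zero, so the log below is defined.
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm


def log_odds(seq, pos_pwm, neg_pwm):
    """Sum of per-position log ratios: promoter model vs. non-promoter model."""
    return sum(math.log(pos_pwm[i][b] / neg_pwm[i][b]) for i, b in enumerate(seq))


def classify(seq, pos_pwm, neg_pwm):
    """Predict 'promoter' when the log-odds score is positive (assumed threshold)."""
    return log_odds(seq, pos_pwm, neg_pwm) > 0.0


def leave_one_out_accuracy(promoters, non_promoters):
    """Hold out each sequence in turn, train on the rest, and score the held-out one."""
    data = [(s, True) for s in promoters] + [(s, False) for s in non_promoters]
    correct = 0
    for i, (seq, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        pos = [s for s, y in train if y]
        neg = [s for s, y in train if not y]
        if classify(seq, build_pwm(pos), build_pwm(neg)) == label:
            correct += 1
    return correct / len(data)


# Toy, made-up sequences for illustration only (not the paper's benchmark data).
toy_promoters = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
toy_non_promoters = ["GCGCGC", "GCGCCC", "CCGCGC", "GCGGGC"]
accuracy = leave_one_out_accuracy(toy_promoters, toy_non_promoters)
```

Replacing the single-nucleotide counts with counts of overlapping k-mers at each position would move the sketch closer to the oligonucleotide statistics the abstract refers to, at the cost of needing more training data per matrix cell.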