• DocumentCode
    2702550
  • Title

    INDARE - An indexed DAG of regular expressions for selecting position frequency matrices

  • Author

    Park, Meeyoung ; Sanghvi, Jubin ; Dinakarpandian, Deendayal

  • Author_Institution
    Univ. of Missouri-Kansas City, Kansas City
  • fYear
    2007
  • fDate
    2-4 Nov. 2007
  • Firstpage
    191
  • Lastpage
    196
  • Abstract
    The identification of putative motifs in biomolecular sequences or whole genomes/proteomes is frequently based on window-based scanning with position frequency matrices (PFMs). The exponential increase in the amount of sequence data and the growing number of patterns to be screened have resulted in the need for rapid screening methods. In recognition of this, we have developed the Indexed DAG of Regular Expressions extractor (INDARE), a tool that dynamically extracts regular expressions (REs) for each PFM in the database and creates a directed acyclic graph (DAG) of REs. The INDARE-generated DAG is very effective in pruning the search space and easily outperforms the naive exhaustive sequential search approach. The method is general enough to be applicable to the identification of motifs in any domain.
  • Keywords
    biology computing; molecular biophysics; INDARE tool; Indexed DAG of Regular Expressions Extractor; sequential search approach; biomolecular sequences; genomes; position frequency matrices; proteomes; Bioinformatics; Cities and towns; Computer science; Data mining; Databases; Frequency; Genomics; Informatics; Inverse problems; Pattern recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine Workshops, 2007. BIBMW 2007. IEEE International Conference on
  • Conference_Location
    Fremont, CA
  • Print_ISBN
    978-1-4244-1604-2
  • Type

    conf

  • DOI
    10.1109/BIBMW.2007.4425418
  • Filename
    4425418