DocumentCode
2702550
Title
INDARE - An indexed DAG of regular expressions for selecting position frequency matrices
Author
Park, Meeyoung ; Sanghvi, Jubin ; Dinakarpandian, Deendayal
Author_Institution
Univ. of Missouri-Kansas City, Kansas City
fYear
2007
fDate
2-4 Nov. 2007
Firstpage
191
Lastpage
196
Abstract
The identification of putative motifs in biomolecular sequences or whole genomes/proteomes is frequently based on window-based scanning with position frequency matrices (PFMs). The exponential increase in the amount of sequence data and the growing number of patterns to be screened have created a need for rapid screening methods. In recognition of this, we have developed the Indexed DAG of Regular Expressions Extractor (INDARE), a tool that dynamically extracts regular expressions (REs) for each PFM in the database and creates a directed acyclic graph (DAG) of REs. The INDARE-generated DAG is very effective in pruning the search space and easily outperforms the naive exhaustive sequential search approach. The method is general enough to be applicable to the identification of motifs in any domain.
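The abstract does not say how the REs are extracted or how the DAG edges are defined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each PFM column becomes a character class of every base with a non-zero count, and that an RE is a parent of another when it accepts a superset of the strings the other accepts. The PFM data and all function names (`pfm_to_classes`, `build_dag`, `candidates`) are hypothetical.

```python
import re

# Hypothetical PFMs: one dict of base counts per motif position.
PFMS = {
    "m1": [{"A": 8, "C": 2}, {"G": 10}, {"T": 5, "C": 5}],
    "m2": [{"A": 10}, {"G": 10}, {"T": 10}],
    "m3": [{"C": 10}, {"C": 10}, {"A": 10}],
}

def pfm_to_classes(pfm):
    """Allowed bases per position: every base with a non-zero count (assumed rule)."""
    return [frozenset(b for b, n in col.items() if n > 0) for col in pfm]

def classes_to_regex(classes):
    """Turn per-position base sets into a compiled RE such as [AC][G][CT]."""
    return re.compile("".join("[" + "".join(sorted(c)) + "]" for c in classes))

def subsumes(general, specific):
    """True if every string accepted by `specific` is also accepted by `general`."""
    return len(general) == len(specific) and all(
        s <= g for g, s in zip(general, specific)
    )

def build_dag(pfms):
    """Link each RE to the strictly more specific REs it subsumes (assumed edge rule)."""
    classes = {name: pfm_to_classes(p) for name, p in pfms.items()}
    children = {name: [] for name in classes}
    has_parent = set()
    for a in classes:
        for b in classes:
            if a != b and classes[a] != classes[b] and subsumes(classes[a], classes[b]):
                children[a].append(b)
                has_parent.add(b)
    roots = [n for n in classes if n not in has_parent]
    regexes = {n: classes_to_regex(c) for n, c in classes.items()}
    return roots, children, regexes

def candidates(window, roots, children, regexes):
    """Names of PFMs whose RE matches the window.

    Children are only reached through a matching parent, so an RE whose
    parents all fail is never tested.
    """
    hits, stack, seen = [], list(roots), set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if regexes[name].fullmatch(window):
            hits.append(name)
            stack.extend(children[name])
    return hits

roots, children, regexes = build_dag(PFMS)
print(sorted(candidates("AGT", roots, children, regexes)))
```

In this sketch only the PFMs returned by `candidates` would need full matrix scoring for a given window; a sequential scan would score every PFM against every window.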
Keywords
biology computing; molecular biophysics; INDARE tool; Indexed DAG of Regular Expressions Extractor; sequential search approach; biomolecular sequences; genomes; position frequency matrices; proteomes; Bioinformatics; Cities and towns; Computer science; Data mining; Databases; Frequency; Genomics; Informatics; Inverse problems; Pattern recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics and Biomedicine Workshops, 2007. BIBMW 2007. IEEE International Conference on
Conference_Location
Fremont, CA
Print_ISBN
978-1-4244-1604-2
Type
conf
DOI
10.1109/BIBMW.2007.4425418
Filename
4425418