Title :
An efficient algorithm for the identification of structured motifs in DNA promoter sequences
Author :
Carvalho, A.M. ; Freitas, A.T. ; Oliveira, A.L. ; Sagot, M.-F.
Author_Institution :
INESC, Lisbon
Abstract :
We propose a new algorithm for identifying cis-regulatory modules in genomic sequences. The proposed algorithm, named RISO, uses a new data structure, called box-link, to store the information about conserved regions that occur in a well-ordered and regularly spaced manner in the data set sequences. This type of conserved regions, called structured motifs, is extremely relevant in the research of gene regulatory mechanisms since it can effectively represent promoter models. The complexity analysis shows a time and space gain over the best known exact algorithms that is exponential in the spacings between binding sites. A full implementation of the algorithm was developed and made available online. Experimental results show that the algorithm is much faster than existing ones, sometimes by more than four orders of magnitude. The application of the method to biological data sets shows its ability to extract relevant consensi
Keywords :
DNA; biology computing; computational complexity; genetics; molecular biophysics; molecular configurations; DNA promoter sequences; RISO; binding sites; box-link; cis-regulatory modules; complexity analysis; gene regulatory mechanisms; genomic sequences; structured motif identification; Algorithm design and analysis; Bioinformatics; Biological system modeling; DNA; Data mining; Data structures; Genomics; Organisms; Predictive models; Sequences; Box-link; binding site consensus.; factor tree; promoter; structured motif; Algorithms; Bacteria; Binding Sites; Computational Biology; Cyclic AMP Receptor Protein; DNA; DNA-Directed RNA Polymerases; Humans; Internet; Promoter Regions, Genetic; Response Elements; Saccharomyces cerevisiae; Sigma Factor; Transcription Factors;
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2006.16