DocumentCode :
932318
Title :
An efficient algorithm for the identification of structured motifs in DNA promoter sequences
Author :
Carvalho, A.M. ; Freitas, A.T. ; Oliveira, A.L. ; Sagot, M.-F.
Author_Institution :
INESC, Lisbon
Volume :
3
Issue :
2
fYear :
2006
Firstpage :
126
Lastpage :
140
Abstract :
We propose a new algorithm for identifying cis-regulatory modules in genomic sequences. The proposed algorithm, named RISO, uses a new data structure, called box-link, to store the information about conserved regions that occur in a well-ordered and regularly spaced manner in the data set sequences. This type of conserved regions, called structured motifs, is extremely relevant in the research of gene regulatory mechanisms since it can effectively represent promoter models. The complexity analysis shows a time and space gain over the best known exact algorithms that is exponential in the spacings between binding sites. A full implementation of the algorithm was developed and made available online. Experimental results show that the algorithm is much faster than existing ones, sometimes by more than four orders of magnitude. The application of the method to biological data sets shows its ability to extract relevant consensi
Keywords :
DNA; biology computing; computational complexity; genetics; molecular biophysics; molecular configurations; DNA promoter sequences; RISO; binding sites; box-link; cis-regulatory modules; complexity analysis; gene regulatory mechanisms; genomic sequences; structured motif identification; Algorithm design and analysis; Bioinformatics; Biological system modeling; DNA; Data mining; Data structures; Genomics; Organisms; Predictive models; Sequences; Box-link; binding site consensus.; factor tree; promoter; structured motif; Algorithms; Bacteria; Binding Sites; Computational Biology; Cyclic AMP Receptor Protein; DNA; DNA-Directed RNA Polymerases; Humans; Internet; Promoter Regions, Genetic; Response Elements; Saccharomyces cerevisiae; Sigma Factor; Transcription Factors;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
ieee
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2006.16
Filename :
1631994
Link To Document :
بازگشت