Abstract :
This paper presents a novel motif discovery algorithm based on multi-objective genetic algorithms to extract non-dominated motifs in DNA sequences. The main advantage of our approach is that a large number of tradeoff (non-dominated) motifs can be obtained in a single run with respect to three conflicting objectives: maximization of similarity, motif length, and support. In this paper, the method extracts non-dominated motifs by considering two objectives at a time while the remaining objective is fixed at a pre-specified value. This gives the user the ability to steer the motif discovery process. Our approach can be applied to any data set of a sequential nature. Furthermore, it allows any choice of similarity measure for finding motifs. By analyzing the discovered non-dominated motifs, the decision maker can understand the tradeoff between the objectives. We compare the approach with three well-known motif discovery methods: AlignACE, MEME, and Weeder. Experimental results on a real data set extracted from the TRANSFAC database demonstrate that the proposed method performs better than the other methods in terms of runtime and prediction accuracy.
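To make the notion of non-dominated motifs concrete, the following is a minimal sketch of Pareto filtering over candidate motifs with one objective held at a pre-specified value. It is not the algorithm from the paper: the `Candidate` type, the `dominates` and `non_dominated` helpers, and the toy scores are all assumptions introduced for illustration, and the genetic search that would generate the candidates is omitted.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Candidate(NamedTuple):
    """Hypothetical candidate motif with its three objective values."""
    motif: str
    similarity: float  # higher is better
    length: int        # higher is better
    support: int       # number of sequences containing the motif


def dominates(a: Candidate, b: Candidate, objectives: Sequence[str]) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(getattr(a, o) >= getattr(b, o) for o in objectives)
    better = any(getattr(a, o) > getattr(b, o) for o in objectives)
    return no_worse and better


def non_dominated(
    candidates: Sequence[Candidate],
    objectives: Sequence[str],
    fixed: Optional[Tuple[str, float]] = None,
) -> list:
    """Return the candidates not dominated by any other candidate.

    If `fixed` is given as (objective_name, value), only candidates whose
    value for that objective equals `value` are considered.
    """
    pool = list(candidates)
    if fixed is not None:
        name, value = fixed
        pool = [c for c in pool if getattr(c, name) == value]
    return [c for c in pool if not any(dominates(o, c, objectives) for o in pool)]


# Toy values, invented purely for illustration.
candidates = [
    Candidate("ACGTAC", 0.90, 6, 4),
    Candidate("ACGTAA", 0.80, 6, 7),
    Candidate("ACGTTT", 0.70, 6, 5),
    Candidate("ACGTACGT", 0.95, 8, 3),
]

# Fix motif length at 6 and trade off similarity against support.
front = non_dominated(candidates, ("similarity", "support"), fixed=("length", 6))
print([c.motif for c in front])
```

With these toy values, `ACGTTT` is dropped because `ACGTAA` is at least as good on both remaining objectives, while `ACGTAC` and `ACGTAA` both survive: one has higher similarity and the other higher support. Inspecting such a set is what lets a decision maker see the tradeoff between objectives.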
Keywords :
DNA; biology; genetic algorithms; AlignACE; DNA sequences; MEME; TRANSFAC database; Weeder; decision maker; motif discovery algorithm; multi-objective genetic algorithm; nondominated motifs; support maximization; Bioinformatics; Cells (biology); DNA computing; Data engineering; Data mining; Genetic engineering; Hybrid intelligent systems; Laboratories; Sequences