DocumentCode :
1634823
Title :
GAPK: Genetic algorithms with prior knowledge for motif discovery in DNA sequences
Author :
Wang, Dianhui ; Li, Xi
Author_Institution :
Dept. of Comput. Sci. & Comput. Eng., La Trobe Univ., Melbourne, VIC
fYear :
2009
Firstpage :
277
Lastpage :
284
Abstract :
Discovery of transcription factor binding sites (TFBSs), or DNA motifs, in the promoter regions of genes plays a key role in understanding the regulation of gene expression. Over the past decade, computational approaches to motif search, including evolutionary computation techniques, have demonstrated good potential, and some results reported in the literature are quite promising. Recent progress on evolutionary mining of motifs has been documented in GAME and GALF-P: GAME employs a Bayesian-based scoring function, while GALF-P aims to improve algorithm performance with local filtering and adaptive post-processing. To improve discovery performance in terms of recall and precision rates and algorithm reliability, this paper presents an alternative genetic algorithm, termed GAPK, for the motif discovery problem. In the proposed GAPK framework, prior knowledge about the motifs in a given dataset is used to initialize the population. Our technical contributions include a matrix representation for k-mers, a mismatch-based filtering method for search space reduction, a model mismatch score (MMS) as the fitness function, new genetic operations and a model refinement process. Benchmark datasets associated with eight transcription factors are used in our experiments. Comparative studies were carried out with well-known tools including GAME, GALF-P, MEME, MDScan and AlignACE. Results show that our method outperforms the other techniques in terms of F-measure.
Keywords :
biocomputing; data mining; filtering theory; genetic algorithms; sequences; Bayesian-based scoring function; DNA motifs; DNA sequences; GAPK; deoxyribonucleic acid; evolutionary computation technique; evolutionary mining; fitness function; gene expression; genetic algorithm; matrix representation; mismatch-based filtering method; model mismatch score; model refinement processing; motif discovery; prior knowledge; search space reduction; transcription factor binding sites; Bioinformatics; Computer science; DNA; Data mining; Filtering algorithms; Gene expression; Genetic algorithms; Genomics; Problem-solving; Sequences;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Evolutionary Computation, 2009. CEC '09. IEEE Congress on
Conference_Location :
Trondheim
Print_ISBN :
978-1-4244-2958-5
Electronic_ISBN :
978-1-4244-2959-2
Type :
conf
DOI :
10.1109/CEC.2009.4982959
Filename :
4982959