Title :
Sequential modeling for identifying CpG island locations in human genome
Author :
Dasgupta, Nilanjan ; Lin, Simon ; Carin, Lawrence
Author_Institution :
Dept. of Electr. & Comput. Eng., Duke Univ., Durham, NC, USA
Abstract :
We consider several sequential processing algorithms for identifying genes in human DNA, based on detecting CpG ("C precedes G") islands. The algorithms are designed to capture the underlying statistical structure in a DNA sequence. Sequential processing with a Markov model and with a hidden Markov model is shown to identify most CpG islands in annotated (marked) DNA subsequences from publicly available DNA datasets. We also consider a wavelet-based hidden Markov tree (HMT). In the context of the HMT, we address the design of adaptive wavelets matched to CpG islands, accomplished via lifting and genetic-algorithm optimization.
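As a rough illustration of the Markov-model idea mentioned in the abstract, the sketch below scores a DNA window by the log-odds between two first-order Markov chains, one trained on island-like sequence and one on background sequence. This is a generic textbook-style formulation and not necessarily the authors' method; the training strings, the pseudocount, and the names `train_markov` and `log_odds` are hypothetical and chosen only for illustration.

```python
import math

ALPHABET = "ACGT"

def train_markov(seqs, pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a).

    A pseudocount is added to every transition so that no probability is zero.
    """
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for s in seqs:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    probs = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in ALPHABET}
    return probs

def log_odds(seq, island, background):
    """Sum of log(P_island / P_background) over consecutive base pairs.

    A positive score means the island model explains the window better.
    """
    return sum(
        math.log(island[a][b] / background[a][b])
        for a, b in zip(seq, seq[1:])
    )

# Toy training data (made up for this sketch, not taken from any real dataset).
island_train = ["CGCGGCGCCGCGCGGC", "GCGCGCCGCGGCGCGC"]
background_train = ["ATTATAAATGCATTTA", "TTAAATATCATTGTAA"]

island = train_markov(island_train)
background = train_markov(background_train)

score_cg = log_odds("CGCGCGCG", island, background)  # CG-rich window
score_at = log_odds("ATATTATA", island, background)  # AT-rich window
print(score_cg > 0, score_at < 0)
```

In practice such a score would be computed over a sliding window along the sequence, with windows above some threshold flagged as candidate islands; the window length and threshold are further assumptions that would need tuning on annotated data.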
Keywords :
DNA; genetic algorithms; genetics; hidden Markov models; medical signal processing; wavelet transforms; CpG island locations; DNA sequence; HMT; Markov model; adaptive wavelets; annotated DNA subsequences; genes; genetic-algorithm; hidden Markov model; human DNA; human genome; optimization; sequential modeling; sequential processing algorithms; statistical structure; wavelet-based hidden Markov tree; Algorithm design and analysis; Bioinformatics; Chemicals; Design optimization; Genomics; Humans; Sequences; Signal processing algorithms
Journal_Title :
Signal Processing Letters, IEEE
DOI :
10.1109/LSP.2002.806062