Title :
Insight into DNA periodicity by a single-channel sequence data approach
Author :
Zoltowski, Mariusz
Author_Institution :
CM UMK, Bydgoszcz, Poland
Date :
Aug. 30 2011-Sept. 3 2011
Abstract :
It has not been obvious how to map a genomic sequence into numbers so that digital signal processing (DSP) can elucidate its periodicities in accord with the underlying biology [1]. The well-known DNA spectra and their extensions are obtained base-wise (A, T, C, G) by applying Fourier (FT), wavelet (WT) or related transforms to the indicator functions (IFs) of these bases. An IF takes the value 1 where the indicated base is present in the sequence and 0 where it is absent. The IFs' spectra can then be combined in different ways, including an optimal one, to give the net spectrum [2]. This contribution attempts to show, among other things, that single-channel numeric DNA is also sufficient for obtaining biologically meaningful results by DSP, with accompanying merits. The approach is plausible because any RNA message can be regarded as a single-channel coded waveform, coded by triplets of bases (codons) that specify 20 different amino acids. This in turn allows a clear justification of the coding rhythm in terms of the codon usage frequency (CUF) and the gene series autocorrelation; the latter simply assesses the self-similarity of the message. Along with adding well-established communication insight to biological perspectives, the contribution addresses how the genetic code becomes specific, inducing self-similarity of the coded sequences under a three-base shift. In support, some findings in vertebrate gene data are elucidated by the empirical mode decomposition (EMD) of the Huang-Hilbert transform (H-HT) [3]; these are long-term spectra relevant to the coding, the content of dicodons and the structural properties of the coded proteins [4]. A new finding on the coding rhythm attributed to coding DNA is also included: a comparison of the net coding rhythms in Homo sapiens, Homo sapiens house-keeping and vertebrate genes by histograms of adaptively tracked amplitudes.
It is intriguing how spectral features of genomic sequences correspond to related physical phenomena [5-8].
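The two representations contrasted in the abstract can be illustrated with a minimal sketch. It builds the four 0/1 indicator functions and sums their power spectra without weighting, then forms a single-channel numeric sequence and evaluates its autocorrelation at small lags. The function names, the numeric values assigned to the bases, the unweighted combination and the toy sequence are all assumptions made for illustration; none of them is taken from the paper, whose actual single-channel mapping is not given in this record.

```python
import numpy as np

BASES = "ACGT"

def indicator_functions(seq):
    """Map each base to a 0/1 array: 1 where the base occurs, 0 elsewhere."""
    symbols = np.array(list(seq.upper()))
    return {base: (symbols == base).astype(float) for base in BASES}

def summed_power_spectrum(seq):
    """Sum |DFT|^2 over the four indicator functions (plain, unweighted combination)."""
    total = np.zeros(len(seq))
    for indicator in indicator_functions(seq).values():
        total += np.abs(np.fft.fft(indicator)) ** 2
    return total

# Hypothetical single-channel mapping, chosen only for illustration.
BASE_VALUES = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}

def single_channel(seq):
    """Turn the sequence into one numeric channel using BASE_VALUES."""
    return np.array([BASE_VALUES[base] for base in seq.upper()])

def autocorrelation(x, lag):
    """Mean-removed autocorrelation at a positive lag, normalised by the lag-0 value."""
    d = x - x.mean()
    return float(np.sum(d[:-lag] * d[lag:]) / np.sum(d * d))

# Toy sequence with an exact three-base repeat (not real gene data).
toy = "ATG" * 20
n = len(toy)

# Four-channel view: strongest non-DC bin in the lower half of the spectrum.
spectrum = summed_power_spectrum(toy)
peak_bin = int(np.argmax(spectrum[1 : n // 2])) + 1

# Single-channel view: self-similarity under one-, two- and three-base shifts.
signal = single_channel(toy)
lags = {lag: autocorrelation(signal, lag) for lag in (1, 2, 3)}

print(peak_bin, n // 3)
print(lags)
```

On this artificial sequence the spectral peak falls at bin N/3, the usual signature of a period-3 rhythm, and the single-channel autocorrelation is largest at the three-base shift. Real coding sequences are only approximately periodic, so both effects would be weaker and would depend on the numeric mapping chosen.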
Keywords :
DNA; Hilbert transforms; RNA; circadian rhythms; genetics; genomics; molecular biophysics; proteins; DNA periodicity; DNA spectra; EMD; Fourier transforms; Homo sapiens; Huang-Hilbert transform; RNA message; amino acids; biological perspective; coded proteins; coded sequence; coding DNA; coding rhythm; codon base; codon usage frequency; dicodons; digital signal processing; gene series autocorrelation; genetic code; genomic sequence; physical phenomena; single channel numeric DNA; single-channel coded waveform; single-channel sequence data approach; spectral feature; structural properties; vertebrate genes; wavelet transforms; Amino acids; Bioinformatics; Encoding; Finite impulse response filter; Genomics; Proteins; Rhythm; Algorithms; Base Sequence; DNA; Molecular Sequence Data; Sequence Alignment; Sequence Analysis, DNA; Signal Processing, Computer-Assisted;
Conference_Title :
Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4244-4121-1
Electronic_ISBN :
1557-170X
DOI :
10.1109/IEMBS.2011.6090678