Title of article :
A Study of the Middle-scale Nucleotide Clustering in DNA Sequences of Various Origin and Functionality, by means of a Method based on a Modified Standard Deviation
Author/Authors :
NIKOLAOU, CHRISTOFOROS and ALMIRANTIS, YANNIS
Issue Information :
Journal, serial issue, year 2002
Abstract :
The deviation from randomness in the distribution of nucleotides in genomic sequences is quantified and studied using a modified standard deviation (MSD). The method involves a “per block” computation of the standard deviation of the nucleotide frequencies of occurrence, using local means (means taken in a neighborhood of each block). This quantity may serve as a scale-dependent measure of nucleotide clustering. In the present work, the meso-scale of tens of nucleotides is principally explored, by means of suitably adjusted filter parameters. This length scale is of an order of magnitude not directly affected by the grammar and syntax rules of the protein-coding procedure, while remaining shorter than the scale at which large-scale characteristics of the genome appear. MSD has been found to distinguish systematically between sequences of different origin and functionality. The most nearly random are found to be the coding sequences of prokaryotes, while in intronic and intergenic regions of eukaryotic genomes extended clustering of similar nucleotides is observed. The distributions of MSD values of large collections of sequences are found to be, in most cases, characteristic of their biological role and origin. Protein-coding and non-coding, prokaryotic and eukaryotic DNA, as well as promoter, rRNA, viral and organelle sequences, have been examined. The presented results corroborate a recently proposed model for genome evolution. The method is also applied to assess the annotation of ORFs taken from the complete genome of Saccharomyces cerevisiae.
Journal title :
Journal of Theoretical Biology
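Illustrative sketch :
The abstract describes the MSD only in words: a per-block standard deviation of nucleotide frequencies, taken around local means rather than a global mean. The Python below is one possible reading of that description, not the authors' exact definition. The function names, the non-overlapping blocks, the symmetric window of neighboring blocks, and the parameters `block_size` and `half_window` are all assumptions introduced for illustration; the abstract does not state the actual filter parameters.

```python
import math
import random


def block_frequencies(seq, nucleotide, block_size):
    """Fraction of `nucleotide` in each consecutive, non-overlapping block.

    Assumption: blocks do not overlap and a trailing partial block is dropped.
    """
    n_blocks = len(seq) // block_size
    return [
        seq[i * block_size:(i + 1) * block_size].count(nucleotide) / block_size
        for i in range(n_blocks)
    ]


def modified_std(freqs, half_window):
    """Root-mean-square deviation of each block frequency from its local mean.

    Assumption: the local mean of block i averages blocks
    i - half_window .. i + half_window, truncated at the sequence ends.
    """
    if not freqs:
        raise ValueError("need at least one block")
    total = 0.0
    for i, f in enumerate(freqs):
        lo = max(0, i - half_window)
        hi = min(len(freqs), i + half_window + 1)
        local_mean = sum(freqs[lo:hi]) / (hi - lo)
        total += (f - local_mean) ** 2
    return math.sqrt(total / len(freqs))


# Hypothetical toy data: a strongly clustered sequence and a shuffled copy
# with the same nucleotide composition.
clustered = ("A" * 20 + "C" * 20) * 25
rng = random.Random(0)
shuffled_list = list(clustered)
rng.shuffle(shuffled_list)
shuffled = "".join(shuffled_list)

msd_clustered = modified_std(block_frequencies(clustered, "A", 10), 1)
msd_shuffled = modified_std(block_frequencies(shuffled, "A", 10), 1)
print(msd_clustered, msd_shuffled)
```

Under these assumptions the clustered toy sequence yields a larger value than its shuffled copy, which matches the abstract's use of MSD as a measure of clustering at the chosen scale. Changing `block_size` and `half_window` changes the length scale being probed.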