Title of article :
Codon usage trajectories and 7-cluster structure of 143 complete bacterial genomic sequences
Author/Authors :
Alexander Gorban, Tatyana Popova, Andrey Zinovyev
Issue Information :
Journal, serial issue, year 2005
Abstract :
Three results are presented. First, we prove the existence of a universal 7-cluster structure in all 143 completely sequenced bacterial genomes available in GenBank in August 2004 and explain its properties. The 7-cluster structure is responsible for the main part of sequence heterogeneity in bacterial genomes. In this sense, the 7-cluster structure is the basic model of a bacterial genome sequence. We demonstrate that four basic “pure” types of this model are observed in nature: “parallel triangles”, “perpendicular triangles”, the degenerate case, and the flower-like type.
Second, we answer the question of how large the position-specific information is and how large the contribution from correlations between nucleotides is. The accuracy of the mean-field (context-free) approximation is estimated for bacterial genomes.
We show that the codon usage of bacterial genomes is, with high accuracy, a multi-linear function of their genomic G+C content (more precisely, it is described by two similar functions, one for eubacterial genomes and the other for archaea). The description of these two codon-usage trajectories is the third result.
All 143 animated 3D cluster scatter plots are collected in a database and made available on our web-site: http://www.ihes.fr/ zinovyev/7clusters.
Journal title :
Physica A Statistical Mechanics and its Applications