Title of article :
Statistics of trinucleotides in coding sequences and evolution
Author/Authors :
Takeuchi, Fumihiko and Futamura, Yasuhiro and Yoshikura, Hiroshi and Yamamoto, Kenji
Issue Information :
Journal with serial number, year 2003
Pages :
11
From page :
139
To page :
149
Abstract :
The aim of this paper is to give measurements indicative of evolutional stages of the species. Two types of statistics of trinucleotides in coding regions are analysed for 27 species. The first one is the codon space, the nucleotide ratio for each of the three codon positions. We apply principal component analysis to this space and extract two principal components faithfully describing the original distribution of the codon space. The first principal component corresponds to the GC content. The second principal component classifies the species into three evolutional groups: Archaea, Bacteria and Eukaryota. The second statistic is the real and theoretical frequency of amino acids. The real frequency of an amino acid in a coding sequence is its frequency in the translated protein. The theoretical frequency is the expected frequency calculated from the ratio of nucleotides. We introduce the discrepancy between these two frequencies as an index of non-randomness of nucleotides in the sequence. This index of non-randomness divides the species into two groups: eukaryotes having smaller non-randomness (i.e. being more random) and prokaryotes having higher non-randomness.
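Illustrative sketch (not taken from the paper): the two statistics named in the abstract can be approximated in a few lines of Python. Several choices here are assumptions, because the abstract does not specify them: the standard genetic code is used, the theoretical frequency is computed from position-specific nucleotide ratios treated as independent, stop codons are excluded and the result renormalised, and the discrepancy is measured as a sum of absolute differences. The function names are hypothetical. The principal component analysis across species, which would operate on the 12 codon-space values per species, is not shown.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
AMINO_ACIDS = sorted(set(AMINO) - {"*"})


def codons(seq):
    """Split a coding sequence into complete, unambiguous codons."""
    seq = seq.upper()
    triplets = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return [c for c in triplets if c in CODON_TABLE]


def codon_space(seq):
    """Nucleotide ratio at each of the three codon positions (3 x 4 values)."""
    cs = codons(seq)
    ratios = []
    for pos in range(3):
        counts = Counter(c[pos] for c in cs)
        ratios.append({b: counts[b] / len(cs) for b in BASES})
    return ratios


def real_frequency(seq):
    """Amino acid frequency in the translated protein (stops ignored)."""
    aas = [CODON_TABLE[c] for c in codons(seq) if CODON_TABLE[c] != "*"]
    counts = Counter(aas)
    return {a: counts[a] / len(aas) for a in AMINO_ACIDS}


def theoretical_frequency(ratios):
    """Expected amino acid frequency if the three positions were independent.

    Assumption: position-specific ratios, stop codons dropped, renormalised.
    """
    freq = dict.fromkeys(AMINO_ACIDS, 0.0)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            freq[aa] += (ratios[0][codon[0]]
                         * ratios[1][codon[1]]
                         * ratios[2][codon[2]])
    total = sum(freq.values())
    return {a: f / total for a, f in freq.items()}


def discrepancy(real, theoretical):
    """Hypothetical non-randomness index: sum of absolute differences."""
    return sum(abs(real[a] - theoretical[a]) for a in AMINO_ACIDS)


seq = "ATGGCTGCAAAATAA"  # toy coding sequence: M A A K stop
space = codon_space(seq)
real = real_frequency(seq)
theo = theoretical_frequency(space)
index = discrepancy(real, theo)
```

Under these assumptions, a larger `index` means the observed amino acid usage departs further from what the nucleotide ratios alone would predict, which matches the abstract's reading of higher non-randomness.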
Keywords :
tRNA abundance , Codon space , Principal component analysis , Statistics of Trinucleotides , Theoretical amino acid frequency , Coding sequence , codon usage , Evolution , randomness
Journal title :
Journal of Theoretical Biology
Serial Year :
2003
Record number :
1535777