Title of article :
Euclidean Distance Analysis Enables Nucleotide Skew Analysis in Viral Genomes
Author/Authors :
Hemert, Formijn van University of Amsterdam - Amsterdam, Netherlands , Jebbink, Maarten University of Amsterdam - Amsterdam, Netherlands , van der Ark, Andries University of Amsterdam - Amsterdam, Netherlands , Scholer, Frits University of Amsterdam - Meibergdreef - Amsterdam, Netherlands , Berkhout, Ben University of Amsterdam - Amsterdam, Netherlands
Abstract :
Nucleotide skew analysis is a versatile method to study the nucleotide composition of RNA/DNA molecules, in particular to
reveal characteristic sequence signatures. For instance, skew analysis of the nucleotide bias of several viral RNA genomes
indicated that this bias is enriched in the unpaired, single-stranded genome regions, thus creating an even more striking
virus-specific signature. The comparison of skew graphs for many virus isolates or families is difficult, time-consuming, and
nonquantitative. Here, we present a procedure for simpler identification of similarities and dissimilarities between
nucleotide skew data of coronavirus, flavivirus, picornavirus, and HIV-1 RNA genomes. Window and step sizes were
normalized to correct for differences in length of the viral genome. Cumulative skew data are converted into pairwise
Euclidean distance matrices, which can be presented as neighbor-joining trees. We present skew value trees for the four virus
families and show that closely related viruses are placed in small clusters. Importantly, the skew value trees are similar to the
trees constructed by a “classical” model of evolutionary nucleotide substitution. Thus, we conclude that the simple calculation
of Euclidean distances between nucleotide skew data allows an easy and quantitative comparison of characteristic sequence
signatures of virus genomes. These results indicate that the Euclidean distance analysis of nucleotide skew data forms a nice
addition to the virology toolbox.
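
The pipeline summarized in the abstract (length-normalized windows, cumulative skew, pairwise Euclidean distances) can be illustrated with a minimal sketch. The abstract does not give the exact skew formula or window settings, so everything below is an assumption made for illustration: the skew is taken as (G - C)/(G + C) per window, windows are non-overlapping (step size equal to window size), and the function names, the `n_windows` parameter, and the toy sequences are hypothetical. It is not the authors' implementation.

```python
import math

def cumulative_skew(seq, a="G", b="C", n_windows=100):
    """Cumulative (a - b)/(a + b) skew over n_windows non-overlapping windows.

    Assumption: window size scales with genome length, so every genome
    yields a profile with the same number of points.
    """
    seq = seq.upper()
    if len(seq) < n_windows:
        raise ValueError("sequence is shorter than the number of windows")
    total = 0.0
    profile = []
    for i in range(n_windows):
        start = i * len(seq) // n_windows
        end = (i + 1) * len(seq) // n_windows
        window = seq[start:end]
        na, nb = window.count(a), window.count(b)
        # A window containing neither nucleotide contributes zero skew.
        total += (na - nb) / (na + nb) if na + nb else 0.0
        profile.append(total)
    return profile

def euclidean(p, q):
    """Euclidean distance between two equal-length skew profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def distance_matrix(profiles):
    """Pairwise distances keyed by (name, name)."""
    names = list(profiles)
    return {(m, n): euclidean(profiles[m], profiles[n])
            for m in names for n in names}

# Toy sequences (not real viral genomes): A and B share a composition
# but differ in length; C has the opposite G/C bias.
genomes = {
    "toy_A": "GGGC" * 50,
    "toy_B": "GGGC" * 100,
    "toy_C": "GCCC" * 50,
}
profiles = {name: cumulative_skew(seq, n_windows=10)
            for name, seq in genomes.items()}
dist = distance_matrix(profiles)
```

In this sketch, the two toy sequences with the same composition but different lengths end up at distance zero, which is the effect length normalization is meant to achieve. The resulting matrix could then be passed to any neighbor-joining implementation to obtain a skew value tree.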
Keywords :
Analysis , Genomes , Skew , HIV-1
Journal title :
Computational and Mathematical Methods in Medicine