DocumentCode :
1763577
Title :
Rearrangement-Based Phylogeny Using the Single-Cut-or-Join Operation
Author :
Biller, Priscila ; Feijao, Pedro ; Meidanis, Joao
Author_Institution :
Inst. of Comput., Univ. of Campinas, Campinas, Brazil
Volume :
10
Issue :
1
fYear :
2013
fDate :
Jan.-Feb. 2013
Firstpage :
122
Lastpage :
134
Abstract :
Recently, the Single-Cut-or-Join (SCJ) operation was proposed as a basis for a new rearrangement distance between multichromosomal genomes, leading to very fast algorithms, both in theory and in practice. However, it was not clear how well this new distance fares when used to solve relevant problems, such as the reconstruction of evolutionary history. In this paper, we advance current knowledge by testing SCJ's ability regarding evolutionary reconstruction in two aspects: 1) How well does SCJ reconstruct evolutionary topologies? and 2) How well does SCJ reconstruct ancestral genomes? In the process of answering these questions, we implemented SCJ-based methods and made them available to the community. We ran experiments using as many as 200 genomes, with as many as 3,000 genes. For the first question, we found that SCJ typically recovers between 60 percent and more than 95 percent of the topology, as measured through the Robinson-Foulds distance (a.k.a. split distance) between trees. In other words, 60 percent to more than 95 percent of the original splits are also present in the reconstructed tree. For the second question, given a topology, SCJ's ability to reconstruct ancestral genomes depends on how far the ancestral node is from the leaves. For nodes close to the leaves, about 85 percent of the gene adjacencies can be recovered. This percentage decreases as we move up the tree, but, even at the root, about 50 percent of the adjacencies are recovered, for as many as 64 leaves. Our findings corroborate the fact that SCJ leads to very conservative genome reconstructions, yielding very few false-positive gene adjacencies in the ancestral genomes, at the expense of a relatively larger number of false negatives. In addition, experiments with real data from the Campanulaceae and Protostomes groups show that SCJ reconstructs topologies of quality comparable to the accepted trees of the species involved. As far as time is concerned, the methods we implemented can find a topology for 64 genomes with 2,000 genes each in about 10.7 minutes, and reconstruct the ancestral genomes in a 64-leaf tree in about 3 seconds, both on a typical desktop computer. It should be noted that our code is written in Java and we made no significant effort to optimize it.
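As an illustration of the quantities mentioned in the abstract, the following is a minimal Python sketch, not the authors' Java implementation. It assumes genomes are given as lists of linear chromosomes of signed integer genes (circular chromosomes are not handled), and the function names `adjacencies`, `scj_distance`, and `adjacency_recovery` are hypothetical. It uses the standard definition of the SCJ distance as the size of the symmetric difference between the two genomes' adjacency sets, and counts true positives, false positives, and false negatives for a reconstructed genome.

```python
def adjacencies(chromosomes):
    """Adjacency set of a genome given as linear chromosomes of signed genes.

    Assumption for this sketch: each gene g has a tail (g, "t") and a head
    (g, "h"); a forward gene is read tail -> head, a reversed gene head -> tail.
    """
    adj = set()
    for chrom in chromosomes:
        for left, right in zip(chrom, chrom[1:]):
            # Extremity at the right end of the left gene.
            a = (abs(left), "h" if left > 0 else "t")
            # Extremity at the left end of the right gene.
            b = (abs(right), "t" if right > 0 else "h")
            adj.add(frozenset((a, b)))
    return adj


def scj_distance(genome1, genome2):
    """SCJ distance: cuts for adjacencies only in genome1, joins for those only in genome2."""
    a1, a2 = adjacencies(genome1), adjacencies(genome2)
    return len(a1 - a2) + len(a2 - a1)


def adjacency_recovery(true_genome, inferred_genome):
    """Return (true positives, false positives, false negatives) over adjacencies."""
    truth, inferred = adjacencies(true_genome), adjacencies(inferred_genome)
    return (len(truth & inferred), len(inferred - truth), len(truth - inferred))


if __name__ == "__main__":
    # Inverting gene 2 removes two adjacencies and creates two new ones.
    print(scj_distance([[1, 2, 3]], [[1, -2, 3]]))
    # A conservative reconstruction: no false positives, one false negative.
    print(adjacency_recovery([[1, 2, 3, 4]], [[1, 2], [3, 4]]))
```

In this encoding a single inversion costs four SCJ operations (two cuts and two joins), and a reconstruction that omits uncertain adjacencies produces false negatives but no false positives, which is the conservative behaviour the abstract describes.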
Keywords :
Java; biology computing; cellular biophysics; evolution (biological); genetics; genomics; trees (mathematics); Campanulaceae groups; Protostomes groups; Robinson-Foulds distance; ancestral genome reconstruction; desktop computer; evolutionary history reconstruction; evolutionary topology reconstruction; false-positive gene adjacencies; leaves; multichromosomal genomes; rearrangement distance; rearrangement-based phylogeny; single-cut-or-join operation; split distance; tree reconstruction; Biological cells; Extremities; Phylogeny; Polynomials; Topology; Vegetation; Genome rearrangement; Animals; Campanulaceae; Computer Simulation; Evolution, Molecular; Gene Rearrangement; Genome; Models, Genetic; Software
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2012.168
Filename :
6389675