DocumentCode
1754982
Title
Algorithms for Genome-Scale Phylogenetics Using Gene Tree Parsimony
Author
Bansal, Mukul S. ; Eulenstein, Oliver
Author_Institution
Comput. Sci. & Artificial Intell. Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA
Volume
10
Issue
4
fYear
2013
fDate
July-Aug. 2013
Firstpage
939
Lastpage
956
Abstract
The use of genomic data sets for phylogenetics is complicated by the fact that evolutionary processes such as gene duplication and loss, or incomplete lineage sorting (deep coalescence), cause incongruence among gene trees. One well-known approach that deals with this complication is gene tree parsimony, which, given a collection of gene trees, seeks a species tree that requires the smallest number of evolutionary events to explain the incongruence of the gene trees. However, a lack of efficient algorithms has limited the use of this approach. Here, we present efficient algorithms for SPR- and TBR-based local search heuristics for gene tree parsimony under the 1) duplication, 2) loss, 3) duplication-loss, and 4) deep coalescence reconciliation costs. These novel algorithms improve upon the time complexities of previous algorithms for these problems by a factor of n, where n is the number of species in the collection of gene trees. Our algorithms provide a substantial improvement in runtime and scalability compared to previous implementations and enable large-scale gene tree parsimony analyses using any of the four reconciliation costs. Our algorithms have been implemented in the software packages DupTree and iGTP, and have already been used to perform several compelling phylogenetic studies.
Keywords
evolution (biological); genetics; genomics; software packages; trees (mathematics); DupTree software packages; SPR-based local search heuristics; TBR-based local search heuristics; deep coalescence reconciliation; gene duplication-loss; gene tree parsimony; genome-scale phylogenetics; genomic data sets; iGTP software packages; Algorithm design and analysis; Bioinformatics; Complexity theory; Genomics; Phylogeny; Search problems; Vegetation; Gene tree parsimony; gene duplication; gene loss; incomplete lineage sorting; minimizing deep coalescences (MDC); phylogenetics; phylogenomics
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2013.103
Filename
6583159