Title :
Parsimony Score of Phylogenetic Networks: Hardness Results and a Linear-Time Heuristic
Author :
Jin, Guohua ; Nakhleh, Luay ; Snir, Sagi ; Tuller, Tamir
Author_Institution :
Dept. of Comput. Sci., Rice Univ., Houston, TX, USA
Abstract :
Phylogenies-the evolutionary histories of groups of organisms-play a major role in representing the interrelationships among biological entities. Many methods for reconstructing and studying such phylogenies have been proposed, almost all of which assume that the underlying history of a given set of species can be represented by a binary tree. Although many biological processes can be effectively modeled and summarized in this fashion, others cannot: recombination, hybrid speciation, and horizontal gene transfer result in networks of relationships rather than trees of relationships. In previous works, we formulated a maximum parsimony (MP) criterion for reconstructing and evaluating phylogenetic networks, and demonstrated its quality on biological as well as synthetic data sets. In this paper, we provide further theoretical results as well as a very fast heuristic algorithm for the MP criterion of phylogenetic networks. In particular, we provide a novel combinatorial definition of phylogenetic networks in terms of ldquoforbidden cycles,rdquo and provide detailed hardness and hardness of approximation proofs for the "smallrdquo MP problem. We demonstrate the performance of our heuristic in terms of time and accuracy on both biological and synthetic data sets. Finally, we explain the difference between our model and a similar one formulated by Nguyen et al., and describe the implications of this difference on the hardness and approximation results.
Keywords :
biological techniques; biology computing; combinatorial mathematics; evolution (biological); genetics; network theory (graphs); binary tree representation; biological relationship networks; evolutionary history; fast heuristic algorithm; forbidden cycles; hardness results; horizontal gene transfer; hybrid speciation; linear time heuristic; maximum parsimony criterion; phylogenetic network MP criterion; phylogenetic network combinatorial definition; phylogenetic network evaluation; phylogenetic network parsimony score; phylogenetic network reconstruction; phylogenies; recombination; Analysis of Algorithms and Problem Complexity; Maximum parsimony; Miscellaneous; hardness and approximation.; horizontal gene transfer; phylogenetic networks; Algorithms; Archaea; Cyclooxygenase 2; Models, Genetic; Phylogeny; Plant Proteins; Plants; Ribosomal Proteins;
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2008.119