Title :
The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI-Based Local Searches
Author :
Bansal, Mukul S. ; Eulenstein, Oliver ; Wehe, André
Author_Institution :
Dept. of Comput. Sci., Iowa State Univ., Ames, IA
Abstract :
The gene-duplication problem is to infer a species supertree from a collection of gene trees that are confounded by complex histories of gene-duplication events. This problem is NP-complete and thus requires efficient and effective heuristics. Existing heuristics perform a stepwise search of the tree space, where each step is guided by an exact solution to an instance of a local search problem. A classical local search problem is the NNI search problem, which is based on the nearest neighbor interchange operation. In this work, we 1) provide a novel near-linear time algorithm for the NNI search problem, 2) introduce extensions that significantly enlarge the search space of the NNI search problem, and 3) present algorithms for these extended versions that are asymptotically just as efficient as our algorithm for the NNI search problem. The exceptional speedup achieved in the extended NNI search problems makes the gene-duplication problem more tractable for large-scale phylogenetic analyses. We verify the performance of our algorithms in a comparison study using sets of large randomly generated gene trees.
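As a rough illustration of the NNI search problem described in the abstract, the following Python sketch scores a candidate species tree by its number of gene duplications and picks the best tree one NNI move away. It is a brute-force baseline that re-scores every neighbor from scratch, not the near-linear time algorithm the authors present. The nested-tuple tree encoding, the function names, and the assumption that gene-tree leaves are labeled directly with species names are all hypothetical choices made for this example.

```python
def leaves(t):
    """Return the set of leaf labels below a nested-tuple tree."""
    if isinstance(t, str):
        return frozenset([t])
    return leaves(t[0]) | leaves(t[1])


def clusters(t):
    """Collect the leaf set (cluster) of every node in tree t."""
    if isinstance(t, str):
        return [frozenset([t])]
    return [leaves(t)] + clusters(t[0]) + clusters(t[1])


def duplications(gene, species_clusters):
    """Count gene-tree nodes that map to the same species node as a child."""
    def lca(t):
        # The LCA of a leaf set is the smallest species cluster containing it.
        s = leaves(t)
        return min((c for c in species_clusters if s <= c), key=len)

    if isinstance(gene, str):
        return 0
    here = lca(gene)
    dup = 1 if here in (lca(gene[0]), lca(gene[1])) else 0
    return (dup
            + duplications(gene[0], species_clusters)
            + duplications(gene[1], species_clusters))


def nni_neighbors(t):
    """Yield every rooted binary tree one NNI move away from t."""
    if isinstance(t, str):
        return
    left, right = t
    # Across the edge to an internal child: swap its sibling with a grandchild.
    if not isinstance(right, str):
        b, c = right
        yield ((left, c), b)
        yield ((b, left), c)
    if not isinstance(left, str):
        a, b = left
        yield ((right, b), a)
        yield ((a, right), b)
    # Moves deeper in the tree, leaving the other subtree untouched.
    for n in nni_neighbors(left):
        yield (n, right)
    for n in nni_neighbors(right):
        yield (left, n)


def best_nni_step(species, gene_trees):
    """Return the lowest-cost tree among species and its NNI neighbors."""
    def cost(s):
        cl = clusters(s)
        return sum(duplications(g, cl) for g in gene_trees)

    return min([species] + list(nni_neighbors(species)), key=cost)
```

A heuristic in the style the abstract describes would call `best_nni_step` repeatedly until the returned tree stops changing. The sketch assumes every species appearing in a gene tree also appears in the species tree.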
Keywords :
biology computing; computational complexity; genetics; genomics; molecular biophysics; optimisation; search problems; NNI-based local searches; NP-complete problem; gene trees; gene-duplication problem; heuristics; large-scale phylogenetic analysis; nearest neighbor interchange; near-linear time algorithms; Computational phylogenetics; gene-duplication; local search; supertrees; NNI; Algorithms; Animals; Computational Biology; Gene Duplication; Models, Genetic; Phylogeny; Sequence Analysis, DNA
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2009.7