DocumentCode :
8157
Title :
Computing the Joint Distribution of Tree Shape and Tree Distance for Gene Tree Inference and Recombination Detection
Author :
Chung, Yujin ; Perna, Nicole T. ; Ane, Cecile
Author_Institution :
Dept. of Stat., Univ. of Wisconsin, Madison, WI, USA
Volume :
10
Issue :
5
fYear :
2013
fDate :
Sept.-Oct. 2013
Firstpage :
1
Lastpage :
1
Abstract :
Ancestral recombination events can cause the underlying genealogy of a site to vary along the genome. We consider Bayesian models to simultaneously detect recombination breakpoints in very long sequence alignments and estimate the phylogenetic tree of each block between breakpoints. The models we consider use a dissimilarity measure between trees in their prior distribution to favor similar trees at neighboring loci. We show empirical evidence in Enterobacteria that neighboring genomic regions have similar trees. The main hurdle to using such models is the need to properly calculate the normalizing function for the prior probabilities on trees. In this work, we quantify the impact of approximating this normalizing function as done in biomc2, a hierarchical Bayesian method to detect recombination based on distance between tree topologies. We then derive an algorithm to calculate the normalizing function exactly, for a Gibbs distribution based on the Robinson-Foulds (RF) distance between gene trees at neighboring loci. At the core is the calculation of the joint distribution of the shape of a random tree and its RF distance to a fixed tree. We also propose fast approximations to the normalizing function, which are shown to be very accurate with little impact on the Bayesian inference.
Keywords :
Bayes methods; bioinformatics; evolution (biological); genetics; genomics; inference mechanisms; microorganisms; statistical distributions; Bayesian inference; Enterobacteria; Gibbs distribution; Robinson-Foulds distance; gene tree distance; gene tree inference detection; gene tree recombination detection; gene tree shape; gene tree topologies; genealogy; genomic regions; hierarchical Bayesian method; neighboring loci; normalizing function calculation; phylogenetic tree estimation; prior probabilities; sequence alignments; Biological system modeling; Phylogeny; Radio frequency; Topology; Vegetation; gene tree discordance; normalizing function; phylogenetic tree; recombination;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2013.109
Filename :
6600684