  • DocumentCode
    8157
  • Title
    Computing the Joint Distribution of Tree Shape and Tree Distance for Gene Tree Inference and Recombination Detection
  • Author
    Chung, Yujin ; Perna, Nicole T. ; Ane, Cecile
  • Author_Institution
    Dept. of Stat., Univ. of Wisconsin, Madison, WI, USA
  • Volume
    10
  • Issue
    5
  • fYear
    2013
  • fDate
    Sept.-Oct. 2013
  • Firstpage
    1
  • Lastpage
    1
  • Abstract
    Ancestral recombination events can cause the underlying genealogy of a site to vary along the genome. We consider Bayesian models to simultaneously detect recombination breakpoints in very long sequence alignments and estimate the phylogenetic tree of each block between breakpoints. The models we consider use a dissimilarity measure between trees in their prior distribution to favor similar trees at neighboring loci. We show empirical evidence in Enterobacteria that neighboring genomic regions have similar trees. The main hurdle to using such models is the need to properly calculate the normalizing function for the prior probabilities on trees. In this work, we quantify the impact of approximating this normalizing function as done in biomc2, a hierarchical Bayesian method to detect recombination based on distance between tree topologies. We then derive an algorithm to calculate the normalizing function exactly, for a Gibbs distribution based on the Robinson-Foulds (RF) distance between gene trees at neighboring loci. At the core is the calculation of the joint distribution of the shape of a random tree and its RF distance to a fixed tree. We also propose fast approximations to the normalizing function, which are shown to be very accurate with little impact on the Bayesian inference.
  • Keywords
    Bayes methods; bioinformatics; evolution (biological); genetics; genomics; inference mechanisms; microorganisms; statistical distributions; Bayesian inference; Enterobacteria; Gibbs distribution; Robinson-Foulds distance; gene tree distance; gene tree inference detection; gene tree recombination detection; gene tree shape; gene tree topologies; genealogy; genomic regions; hierarchical Bayesian method; neighboring loci; normalizing function calculation; phylogenetic tree estimation; prior probabilities; sequence alignments; Biological system modeling; Phylogeny; Radio frequency; Topology; Vegetation; gene tree discordance; normalizing function; phylogenetic tree; recombination
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2013.109
  • Filename
    6600684