• DocumentCode
    16436
  • Title

    An Algorithm for Constructing Principal Geodesics in Phylogenetic Treespace

  • Author

    Nye, Tom M. W.

  • Author_Institution
    Sch. of Math. & Stat., Newcastle Univ., Newcastle upon Tyne, UK
  • Volume
    11
  • Issue
    2
  • fYear
    2014
  • fDate
    March-April 2014
  • Firstpage
    304
  • Lastpage
    315
  • Abstract
    Most phylogenetic analyses result in a sample of trees, but summarizing and visualizing these samples can be challenging. Consensus trees often provide limited information about a sample, and so methods such as consensus networks, clustering and multidimensional scaling have been developed and applied to tree samples. This paper describes a stochastic algorithm for constructing a principal geodesic or line through treespace which is analogous to the first principal component in standard principal components analysis. A principal geodesic summarizes the most variable features of a sample of trees, in terms of both tree topology and branch lengths, and it can be visualized as an animation of smoothly changing trees. The algorithm performs a stochastic search through parameter space for a geodesic which minimizes the sum of squared projected distances of the data points. This procedure aims to identify the globally optimal principal geodesic, though convergence to locally optimal geodesics is possible. The methodology is illustrated by constructing principal geodesics for experimental and simulated data sets, demonstrating the insight into samples of trees that can be gained and how the method improves on a previously published approach. A Java package called GeoPhytter for constructing and visualizing principal geodesics is freely available from www.ncl.ac.uk/ ntmwn/geophytter.
  • Keywords
    biology computing; data visualisation; differential geometry; genetics; genomics; pattern clustering; principal component analysis; stochastic processes; GeoPhytter; branch lengths; consensus networks; consensus trees; data points; first principal component; globally optimal principal geodesic; java package; locally optimal geodesics; multidimensional scaling; parameter space; phylogenetic analyses; principal geodesics visualization; smoothly changing tree animation; squared projected distances; standard principal components analysis; stochastic algorithm; stochastic search; tree samples; tree topology; treespace; Bioinformatics; Computational biology; Measurement; Phylogeny; Principal component analysis; Topology; Vegetation; principal components analysis
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2014.2309599
  • Filename
    6755452