• DocumentCode
    26532
  • Title

    COSPEDTree: COuplet Supertree by Equivalence Partitioning of Taxa Set and DAG Formation

  • Author

    Bhattacharyya, Sourya ; Mukherjee, Jayanta

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kharagpur, Kharagpur, India
  • Volume
    12
  • Issue
    3
  • fYear
    2015
  • fDate
    May-June 1 2015
  • Firstpage
    590
  • Lastpage
    603
  • Abstract
    From a set of phylogenetic trees with overlapping taxa set, a supertree exhibits evolutionary relationships among all input taxa. The key is to resolve the contradictory relationships with respect to input trees, between individual taxa subsets. Formulation of this NP hard problem employs either local search heuristics to reduce tree search space, or resolves the conflicts with respect to fixed or varying size subtree level decompositions. Different approximation techniques produce supertrees with considerable performance variations. Moreover, the majority of the algorithms involve high computational complexity, thus not suitable for use on large biological data sets. Current study presents COSPEDTree, a novel method for supertree construction. The technique resolves source tree conflicts by analyzing couplet (taxa pair) relationships for each source trees. Subsequently, individual taxa pairs are resolved with a single relation. To prioritize the consensus relations among individual taxa pairs for resolving them, greedy scoring is employed to assign higher score values for the consensus relations among a taxa pair. Selected set of relations resolving individual taxa pairs is subsequently used to construct a directed acyclic graph (DAG). Vertices of DAG represents a taxa subset inferred from the same speciation event. Thus, COSPEDTree can generate non-binary supertrees as well. Depth first traversal on this DAG yields final supertree. According to the performance metrics on branch dissimilarities (such as FP, FN and RF), COSPEDTree produces mostly conservative, well resolved supertrees. Specifically, RF metrics are mostly lower compared to the reference approaches, and FP values are lower apart from only strictly conservative (or veto) approaches. COSPEDTree has worst case time and space complexities of cubic and quadratic order, respectively, better or comparable to the reference approaches. Such high performance and low computational costs enable COSPEDTree to be - pplied on large scale biological data sets.
  • Keywords
    biology computing; computational complexity; evolution (biological); genetics; trees (mathematics); COSPEDTree; COuplet Supertree; DAG Formation; approximation techniques; biological data sets; computational complexity; couplet taxa pair relationships; cubic order; directed acyclic graph; equivalence partitioning; evolutionary relationships; local search heuristics; nonbinary supertrees; overlapping taxa set; phylogenetic trees; quadratic order; size subtree level decompositions; space complexities; taxa subsets; tree search space; Complexity theory; Computational biology; IEEE transactions; Materials requirements planning; Measurement; Phylogeny; Directed Acyclic Graph (DAG); Phylogenetics; Phylogenetics, Supertrees; directed acyclic graph (DAG); supertrees;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    ieee
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2014.2366778
  • Filename
    6945851