• DocumentCode
    952135
  • Title

    Consensus Genetic Maps as Median Orders from Inconsistent Sources

  • Author

    Jackson, Benjamin N. ; Schnable, Patrick S. ; Aluru, Srinivas

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Iowa State Univ., Ames, IA
  • Volume
    5
  • Issue
    2
  • fYear
    2008
  • Firstpage
    161
  • Lastpage
    171
  • Abstract
    A genetic map is an ordering of genetic markers calculated from a population of known lineage. Although, traditionally, a map has been generated from a single population for each species, recently, researchers have created maps from multiple populations. In the face of these new data, we address the need to find a consensus map - a map that combines the information from multiple partial and possibly inconsistent input maps. We model each input map as a partial order and formulate the consensus problem as finding a median partial order. Finding the median of multiple total orders (preferences or rankings) is a well-studied problem in social choice. We choose to find the median by using the weighted symmetric difference distance, which is a more general version of both the symmetric difference distance and the Kemeny distance. Finding a median order using this distance is NP-hard. We show that, for our chosen weight assignment, a median order satisfies the positive responsiveness, extended Condorcet, and unanimity criteria. Our solution involves finding the maximum acyclic subgraph of a weighted directed graph. We present a method that dynamically switches between an exact branch and bound algorithm and a heuristic algorithm and show that, for real data from closely related organisms, an exact median can often be found. We present experimental results by using seven populations of the crop plant Zea mays.
  • Keywords
    biochemistry; biology computing; botany; cellular biophysics; directed graphs; genetics; heuristic programming; molecular biophysics; optimisation; Kemeny distance; NP-hard; Zea mays crop plant; closely related organisms; genetic map; genetic markers ordering; heuristic algorithm; known lineage population; maximum acyclic subgraph; median partial order; single population species; unanimity criteria; weighted directed graph; weighted symmetric difference distance; Genetic map; Kemeny distance; median order; path and circuit problems; symmetric difference distance.; Algorithms; Chromosome Mapping; Computational Biology; Consensus Sequence; Software;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    ieee
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2007.70221
  • Filename
    4359879