• DocumentCode
    1504498
  • Title

    Pure Parsimony Xor Haplotyping

  • Author

    Bonizzoni, Paola ; Vedova, Gianluca Della ; Dondi, Riccardo ; Pirola, Yuri ; Rizzi, Romeo

  • Author_Institution
    Dipt. di Statistica, Univ. Degli Studi di Milano-Bicocca, Milano, Italy
  • Volume
    7
  • Issue
    4
  • fYear
    2010
  • Firstpage
    598
  • Lastpage
    610
  • Abstract
    The haplotype resolution from xor-genotype data has been recently formulated as a new model for genetic studies. The xor-genotype data is a cheaply obtainable type of data distinguishing heterozygous from homozygous sites without identifying the homozygous alleles. In this paper, we propose a formulation based on a well-known model used in haplotype inference: pure parsimony. We exhibit exact solutions of the problem by providing polynomial time algorithms for some restricted cases and a fixed-parameter algorithm for the general case. These results are based on some interesting combinatorial properties of a graph representation of the solutions. Furthermore, we show that the problem has a polynomial time k-approximation, where k is the maximum number of xor-genotypes containing a given single nucleotide polymorphisms (SNP). Finally, we propose a heuristic and produce an experimental analysis showing that it scales to real-world large instances taken from the HapMap project.
  • Keywords
    genetics; genomics; polynomial approximation; HapMap project; combinatorial properties; fixed-parameter algorithm; genetic studies; graph representation; haplotype inference; haplotype resolution; heterozygous sites; heuristic; homozygous sites; polynomial time algorithms; polynomial time k-approximation; pure parsimony xor haplotyping; xor-genotype data; Approximation algorithms; Biological cells; Biological system modeling; Biology computing; Data mining; Genetics; Inference algorithms; Organisms; Phylogeny; Polynomials; Algorithms; approximation algorithms; graph representation.; haplotype resolution; pure parsimony; Algorithms; Computational Biology; Genotype; Haplotypes; Heterozygote; Polymorphism, Single Nucleotide;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    ieee
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2010.52
  • Filename
    5473211