Title :
Pure Parsimony Xor Haplotyping
Author :
Bonizzoni, Paola ; Vedova, Gianluca Della ; Dondi, Riccardo ; Pirola, Yuri ; Rizzi, Romeo
Author_Institution :
Dipt. di Statistica, Univ. Degli Studi di Milano-Bicocca, Milano, Italy
Abstract :
The haplotype resolution from xor-genotype data has been recently formulated as a new model for genetic studies. The xor-genotype data is a cheaply obtainable type of data distinguishing heterozygous from homozygous sites without identifying the homozygous alleles. In this paper, we propose a formulation based on a well-known model used in haplotype inference: pure parsimony. We exhibit exact solutions of the problem by providing polynomial time algorithms for some restricted cases and a fixed-parameter algorithm for the general case. These results are based on some interesting combinatorial properties of a graph representation of the solutions. Furthermore, we show that the problem has a polynomial time k-approximation, where k is the maximum number of xor-genotypes containing a given single nucleotide polymorphisms (SNP). Finally, we propose a heuristic and produce an experimental analysis showing that it scales to real-world large instances taken from the HapMap project.
Keywords :
genetics; genomics; polynomial approximation; HapMap project; combinatorial properties; fixed-parameter algorithm; genetic studies; graph representation; haplotype inference; haplotype resolution; heterozygous sites; heuristic; homozygous sites; polynomial time algorithms; polynomial time k-approximation; pure parsimony xor haplotyping; xor-genotype data; Approximation algorithms; Biological cells; Biological system modeling; Biology computing; Data mining; Genetics; Inference algorithms; Organisms; Phylogeny; Polynomials; Algorithms; approximation algorithms; graph representation.; haplotype resolution; pure parsimony; Algorithms; Computational Biology; Genotype; Haplotypes; Heterozygote; Polymorphism, Single Nucleotide;
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2010.52