DocumentCode :
1809642
Title :
Workshop: Robust algorithms for inferring haplotype phase and deletion polymorphism from high-throughput whole genome sequence data
Author :
Aguiar, Derek ; Istrail, Sorin
Author_Institution :
Dept. of Comput. Sci. & Center for Comput. Mol. Biol., Brown Univ., Providence, RI, USA
fYear :
2012
fDate :
23-25 Feb. 2012
Firstpage :
1
Lastpage :
1
Abstract :
Genetic heterogeneity of rare mutations with severe effects is increasingly viewed as a major component of disease[1]. Autism is an excellent example, where active research seeks to match phenotypic and genomic heterogeneities. A substantial portion of autism appears to be correlated with copy number variation, which is not directly probed by high-throughput next generation sequencing (NGS) or single nucleotide polymorphism (SNP) array technologies[2]. Furthermore, de novo and mapping-based genome assembly methods produce phase-ambiguous assemblies due to the limitations of current sequencing technologies. As a result, phase-dependent interactions between SNP variants may hide complex genetic heterogeneities associated with disease. Thus, identifying the genetic heterogeneity of complex disease remains a major unresolved computational problem due, in part, to the inability of algorithms to detect small deletions and to phase single nucleotide polymorphisms. In the first part of this talk, we will present an algorithmic framework, termed DELISHUS, that implements a highly efficient algorithm for inferring genomic deletions of all sizes and frequencies in SNP array data. The core of the framework is a de facto polynomial time backtracking algorithm that computes all potential inherited deletions in a dataset; it finishes on a 1 billion entry genome-wide association study SNP matrix in a few minutes. With very few modifications, DELISHUS may also infer regions that contain de novo deletions. We will show that DELISHUS has significantly higher sensitivity and specificity than previously developed methods and present a genome-wide deletion map of autism. DELISHUS may be run with SNP array or NGS data. In the second part of the talk, we will present our recent work on haplotype assembly of NGS data using our HAPCOMPASS algorithm. We suggest two new metrics for evaluating the quality of a haplotype assembly that do not require knowledge of the true haplotypes. Finally, we will show that HAPCOMPASS performs significantly better than the Genome Analysis ToolKit[3] and HapCut[4] on 1000 Genomes data as well as simulated data for a variety of metrics.
Keywords :
diseases; genetics; genomics; medical computing; polymorphism; DELISHUS; Genome Analysis ToolKit; HAPCOMPASS algorithm; HapCut; NGS; SNP array; autism; deletion polymorphism; disease; genetic heterogeneity; haplotype phase; high-throughput whole genome sequence data; next generation sequencing; polynomial time backtracking algorithm; single nucleotide polymorphism; Arrays; Assembly; Autism; Bioinformatics; Genomics; deletion inference; genomic deletions; haplotype assembly; haplotype phasing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Advances in Bio and Medical Sciences (ICCABS), 2012 IEEE 2nd International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4673-1320-9
Electronic_ISBN :
978-1-4673-1319-3
Type :
conf
DOI :
10.1109/ICCABS.2012.6182665
Filename :
6182665