DocumentCode :
1791710
Title :
Towards integrating the detection of genetic variants into an in-memory database
Author :
Fahnrich, Cindy ; Schapranow, Matthieu P. ; Plattner, Hasso
Author_Institution :
Enterprise Platform & Integration Concepts, Hasso Plattner Inst., Potsdam, Germany
fYear :
2014
fDate :
27-30 Oct. 2014
Firstpage :
27
Lastpage :
32
Abstract :
Next-generation sequencing enables whole-genome sequencing within a few hours at minimal cost, making advanced medical applications such as personalized treatments possible. However, this recent technology poses new challenges for the subsequent analysis steps, alignment and variant calling. Compared to earlier sequencing technologies, both must process a growing amount of data of significantly lower quality, and they are currently not capable of doing so. In this work, we address these challenges for the identification of Single Nucleotide Polymorphisms (SNP calling) in genome data, which is one subtask of variant calling. We propose the use of a column-store in-memory database for efficient data processing and apply the statistical model provided by the Genome Analysis Toolkit's UnifiedGenotyper. Comparisons with the UnifiedGenotyper show that our approach can exploit all available computational resources and accelerates SNP calling by up to a factor of 22.
Keywords :
data analysis; genetics; medical computing; statistical analysis; SNP; UnifiedGenotyper; column-store in-memory database; data processing; data quality; genetic variant detection; genome analysis toolkit; genome data; genome sequencing; next-generation sequencing; personalized treatments; single nucleotide polymorphisms; statistical model; variant calling; Bioinformatics; Biological cells; Databases; Genomics; Instruction sets; Runtime; Sequential analysis; Genome Data Analysis; In-Memory Database Technology; Next-Generation Sequencing; Single Nucleotide Polymorphism; Variant Calling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
Type :
conf
DOI :
10.1109/BigData.2014.7004389
Filename :
7004389