DocumentCode :
3453427
Title :
Read-mapping using personalized diploid reference genome for RNA sequencing data reduced bias for detecting allele-specific expression
Author :
Shuai Yuan ; Zhaohui Qin
Author_Institution :
Math. & Comput. Sci. Dept., Emory Univ., Atlanta, GA, USA
fYear :
2012
fDate :
4-7 Oct. 2012
Firstpage :
718
Lastpage :
724
Abstract :
Next-generation sequencing (NGS) technologies have been applied extensively in many areas of genetics and genomics research. A fundamental problem when it comes to analyzing NGS data is mapping short sequencing reads back to the reference genome. Most existing software packages rely on a single uniform reference genome and do not automatically take genetic variants into consideration. However, a large proportion of incorrectly mapped reads can distort the interpretation of NGS experimental results. As an example, Degner et al. showed that detecting allele-specific expression from RNA sequencing data was biased toward the reference allele. In this study, we developed a method that uses the parallel computing power of a DirectX 11 enabled graphics processing unit (GPU) to produce a personalized diploid reference genome based on all known genetic variants of a particular individual. We show that using such a personalized diploid reference genome can improve mapping accuracy and significantly reduce the bias toward the reference allele in allele-specific expression analysis. Our method can be applied to any individual for whom genotype information is available, obtained either from array-based genotyping or from resequencing. Apart from the change of reference genome, no modification to the alignment algorithm is needed for read mapping, so any existing read-mapping tool can be used to obtain the improved mapping results. C++ and GPU compute shader source code of the software program is available at: http://code.google.com/p/diploid-mapping/downloads/list.
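The core idea of a personalized diploid reference can be illustrated with a minimal sketch. This is not the authors' C++/GPU compute shader implementation; it is a plain CPU illustration that assumes only single-nucleotide variants, 0-based positions, and phased genotypes. The names `Snp` and `build_diploid_reference` are hypothetical and do not come from the released software.

```python
from typing import Iterable, NamedTuple, Tuple


class Snp(NamedTuple):
    """One single-nucleotide variant of the individual (hypothetical record type)."""
    pos: int       # 0-based position in the reference sequence (assumed convention)
    allele_a: str  # base carried on the first haplotype
    allele_b: str  # base carried on the second haplotype


def build_diploid_reference(reference: str, snps: Iterable[Snp]) -> Tuple[str, str]:
    """Return two haplotype sequences with the individual's alleles substituted in.

    Assumption: variants are SNPs only, so both haplotypes keep the
    coordinates of the original reference.
    """
    hap_a = list(reference)
    hap_b = list(reference)
    for snp in snps:
        if not 0 <= snp.pos < len(reference):
            raise ValueError(f"SNP position {snp.pos} is outside the reference")
        hap_a[snp.pos] = snp.allele_a
        hap_b[snp.pos] = snp.allele_b
    return "".join(hap_a), "".join(hap_b)


if __name__ == "__main__":
    ref = "ACGTACGT"
    # One heterozygous site (position 2) and one homozygous non-reference site (position 5).
    snps = [Snp(2, "G", "T"), Snp(5, "A", "A")]
    hap_a, hap_b = build_diploid_reference(ref, snps)
    print(hap_a)
    print(hap_b)
```

Both haplotype sequences would then be supplied to an unmodified read mapper in place of the single reference, so a read carrying the non-reference allele at a heterozygous site has an exact match available.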
Keywords :
RNA; biology computing; genetics; genomics; graphics processing units; molecular biophysics; molecular configurations; C++ compute shader source code; DirectX 11 enabled graphics processing unit; GPU compute shader source code; RNA sequencing data reduced bias; allele-specific expression detection; genetics research; genomics research; next generation sequencing technologies; parallel computing power; personalized diploid reference genome; read mapping; single uniform reference genome; software packages; Accuracy; Bioinformatics; Biological cells; Error analysis; Genomics; Software; Allele specific expression; GPU programming; RNA-sequencing; read mapping; reference genome; single nucleotide polymorphism;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedicine Workshops (BIBMW), 2012 IEEE International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-1-4673-2746-6
Electronic_ISBN :
978-1-4673-2744-2
Type :
conf
DOI :
10.1109/BIBMW.2012.6470225
Filename :
6470225