DocumentCode :
659423
Title :
GPU-accelerated adaptive compression framework for genomics data
Author :
GuiXin Guo ; Shuang Qiu ; Zhiqiang Ye ; Bingqiang Wang ; Lin Fang ; Mian Lu ; See, Solomon ; Rui Mao
Author_Institution :
BGI-Shenzhen, Shenzhen, China
fYear :
2013
fDate :
6-9 Oct. 2013
Firstpage :
181
Lastpage :
186
Abstract :
Genomics data is being produced at an unprecedented rate, especially in the context of clinical applications and grand challenge questions. There are various types of data in genomics research, most of which are stored as plain text tables. A data compression framework tailored to this file type is introduced in this paper, featuring a combination of generic compression algorithms, GPU acceleration, and column-major storage. This approach is the first to achieve both compression and decompression rates of around 100 MB/s on commodity hardware without compromising compression ratio. By selecting appropriate compression schemes for each column of data, this framework efficiently exploits data redundancy while remaining applicable to a wide range of formats. The GPU-accelerated implementation also properly exploits the parallelism of compression algorithms. Finally, this paper presents a novel first-order Markov model based transformation, with evidence that it is at least as effective as Burrows-Wheeler and Move-To-Front in some contexts.
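The per-column scheme selection described in the abstract can be illustrated with a minimal CPU-only sketch. It is not the authors' implementation: the codec set (zlib, bz2, lzma from the Python standard library), the "smallest output wins" rule, and all function names are assumptions made for illustration, and no GPU acceleration or Markov-model transformation is attempted.

```python
import bz2
import lzma
import zlib

# Candidate general-purpose codecs. This set is an assumption; the record
# does not say which algorithms the framework actually combines.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def to_columns(text, sep="\t"):
    """Split a plain-text table into columns (column-major layout)."""
    rows = [line.split(sep) for line in text.splitlines()]
    return [list(col) for col in zip(*rows)]


def compress_column(values):
    """Try every codec on one column and keep the smallest result."""
    raw = "\n".join(values).encode("utf-8")
    name = min(CODECS, key=lambda n: len(CODECS[n][0](raw)))
    return name, CODECS[name][0](raw)


def compress_table(text, sep="\t"):
    """Compress each column independently; returns (codec name, bytes) pairs."""
    return [compress_column(col) for col in to_columns(text, sep)]


def decompress_table(blocks, sep="\t"):
    """Invert compress_table for tables whose rows all have the same width."""
    cols = [
        CODECS[name][1](data).decode("utf-8").split("\n")
        for name, data in blocks
    ]
    return "\n".join(sep.join(row) for row in zip(*cols))
```

Because each column is compressed independently, the columns could in principle be processed in parallel, which is the property a GPU implementation would exploit. The sketch assumes a rectangular table with no trailing newline.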
Keywords :
Markov processes; biology computing; data compression; genomics; graphics processing units; Burrows-Wheeler transformation; GPU acceleration; GPU-accelerated adaptive compression framework; column-major storage; commodity hardware; compression algorithms; compression rate; compression ratio; data compression framework; decompression rate; first-order Markov model based transformation; generic compression algorithms; genomics data; genomics research; graphics processing unit; move-to-front transformation; parallelism; Acceleration; Bioinformatics; Compression algorithms; Genomics; Graphics processing units; Markov processes; Transforms; GPU; Markov model; big data; data compression; parallel algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
Type :
conf
DOI :
10.1109/BigData.2013.6691572
Filename :
6691572