Title :
GPU Acceleration of Pyrosequencing Noise Removal
Author :
Gao, Yang ; Bakos, Jason D.
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of South Carolina, Columbia, SC, USA
Abstract :
AmpliconNoise [1], an updated version of Pyronoise [2], is a tool for removing noise from metagenomic data recorded by a 454 pyrosequencer. AmpliconNoise has been shown to be effective both in reducing the overestimation of operational taxonomic units (OTUs) and in chimera detection. AmpliconNoise's noise removal method relies on clustering a large set of short sequences read by the sequencer. The DNA sequencing algorithm requires the computation of O(n^2) pairwise distances using a global sequence alignment method. Each sequence consists of a few hundred base pairs and a typical dataset contains 10^4 sequences, making the clustering computation extremely expensive. In this paper we describe a GPU kernel implementation of the most computationally expensive module in the AmpliconNoise software package, SeqDist. With our GPU workstation (Intel Core i7 980 @ 3.33 GHz + 3 x NVIDIA Tesla C2070) and a typical 454 dataset, our implementation (CUDA-SeqDist) achieves an 8.6X speedup with a single GPU when compared with 12 MPI ranks of the original tools running on the CPU alone. With three GPUs, we achieve a further 2.1X speedup over the single-GPU version, yielding a total speedup of 18.3X. We measure the throughput of our kernel to be 1.4 giga floating-point cell updates per second (GFCUPS) with a single GPU and 2.9 GFCUPS with three GPUs, where GFCUPS refers to the unique method by which the score matrix must be updated in the specialized alignment algorithm used in AmpliconNoise.
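The all-pairs cost described in the abstract can be illustrated with a minimal CPU sketch. It assumes a plain Needleman-Wunsch global alignment with unit mismatch and gap penalties, which is not the specialized scoring used by SeqDist; the names `nw_distance` and `pairwise_distances` and the toy reads are hypothetical. It shows only why n sequences require n(n-1)/2 alignments (49,995,000 pairs for n = 10^4) and how score-matrix cell updates accumulate.

```python
from itertools import combinations


def nw_distance(a: str, b: str, mismatch: float = 1.0, gap: float = 1.0) -> float:
    """Global alignment cost between a and b (illustrative unit penalties)."""
    # prev holds the previous row of the score matrix; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0.0] * len(b)
        for j in range(1, len(b) + 1):
            # One cell update: best of substitution, deletion, insertion.
            sub = prev[j - 1] + (0.0 if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = min(sub, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def pairwise_distances(seqs):
    """Return ({(i, j): distance} for all i < j, total cell updates)."""
    dist = {}
    cells = 0
    for i, j in combinations(range(len(seqs)), 2):
        dist[(i, j)] = nw_distance(seqs[i], seqs[j])
        cells += len(seqs[i]) * len(seqs[j])  # one matrix per pair
    return dist, cells


# Toy reads (assumed for illustration; real reads are a few hundred bases).
reads = ["ACGTACGT", "ACGTTCGT", "ACGACGT"]
dist, cells = pairwise_distances(reads)
```

Because every pair is aligned independently, the work is naturally parallel, which is the property a GPU kernel would exploit; dividing `cells` by the elapsed time gives a cell-updates-per-second figure analogous to the GFCUPS metric reported above.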
Keywords :
graphics processing units; matrix algebra; signal denoising; AmpliconNoise software package; CPU; CUDA-SeqDist speedup; DNA sequencing algorithm; GFCUPS; GPU acceleration; GPU kernel implementation; Intel Core i7 980; MPI ranks; NVIDIA Tesla C2070; OTU; chimera detection; frequency 3.33 GHz; giga floating-point cell updates per second; global sequence alignment method; operational taxonomic units; pairwise distances; pyrosequencing noise removal; score matrix; specialized alignment algorithm; Graphics processing unit; Instruction sets; Kernel; Memory management; Optimization; Registers; Throughput; Amplicon Noise; CUDA; GPU; GPU Computing; Heterogeneous Computing; MPI; Metagenomics; Needleman-Wunsch; Pyronoise; Sequence Alignment; Short Reads; Smith-Waterman;
Conference_Title :
Application Accelerators in High Performance Computing (SAAHPC), 2012 Symposium on
Conference_Location :
Chicago, IL
Print_ISBN :
978-1-4673-2882-1
DOI :
10.1109/SAAHPC.2012.15