DocumentCode :
2957919
Title :
A Parallel Algorithm for Spectrum-based Short Read Error Correction
Author :
Shah, Ankit R. ; Chockalingam, Sriram ; Aluru, Srinivas
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Bombay, Mumbai, India
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
60
Lastpage :
70
Abstract :
Correcting sequence errors in high-throughput DNA sequencing by taking advantage of redundant sampling and low error rates is often an important first step in applications of this technology. Consequently, a number of error correction methods have been developed in recent years. Due to an order-of-magnitude throughput gain per year, some of these technologies now generate upwards of a billion reads per run. In this paper, we present an algorithm for parallelizing error correction methods that are based on the frequency spectrum of k-mers observed in the input reads. Based on this, we present a parallelization of Reptile, a recently introduced error correction method that employs frequency spectra of two different lengths, one for identifying correction possibilities and another for providing contextual information. Our method is well suited for distributed-memory parallel computers and clusters. Experimental results indicate that the method achieves near-linear speedup and provides the ability to scale to larger data sets than previously demonstrated.
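The spectrum-based idea summarized in the abstract can be illustrated with a minimal sketch. This is not Reptile or the authors' parallel algorithm: it assumes a single k-mer length, simulates the distributed spectrum inside one process with one table per "rank", and replaces a low-frequency k-mer with its most frequent Hamming-distance-1 neighbor. The names `owner`, `build_spectrum`, `count`, `correct_kmer`, and the `threshold` parameter are hypothetical, chosen only for illustration.

```python
import zlib
from collections import Counter


def kmers(read, k):
    """Yield every length-k substring of a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def owner(kmer, num_ranks):
    """Assumed partitioning rule: a deterministic hash picks the rank
    that stores the count for this k-mer."""
    return zlib.crc32(kmer.encode("ascii")) % num_ranks


def build_spectrum(reads, k, num_ranks):
    """Simulate a distributed k-mer spectrum as one Counter per rank.
    A real distributed-memory code would exchange messages instead."""
    tables = [Counter() for _ in range(num_ranks)]
    for read in reads:
        for km in kmers(read, k):
            tables[owner(km, num_ranks)][km] += 1
    return tables


def count(tables, kmer):
    """Look up a k-mer count on the rank that owns it (0 if unseen)."""
    return tables[owner(kmer, len(tables))][kmer]


def correct_kmer(tables, kmer, threshold):
    """If kmer is rare, return its most frequent Hamming-distance-1
    neighbor whose count reaches the threshold; otherwise return kmer."""
    if count(tables, kmer) >= threshold:
        return kmer
    best, best_count = kmer, 0
    for i, original in enumerate(kmer):
        for base in "ACGT":
            if base == original:
                continue
            candidate = kmer[:i] + base + kmer[i + 1:]
            c = count(tables, candidate)
            if c >= threshold and c > best_count:
                best, best_count = candidate, c
    return best


# Toy input: five error-free copies of a read plus one copy with a substitution.
reads = ["ACGTACGT"] * 5 + ["ACGAACGT"]
tables = build_spectrum(reads, k=4, num_ranks=3)
```

With this toy input and a threshold of 3, `correct_kmer(tables, "ACGA", 3)` returns `"ACGT"`, because `ACGA` occurs once while its neighbor `ACGT` occurs eleven times. Hashing each k-mer to an owning rank is one simple way to spread a spectrum across distributed memory; the paper's actual partitioning and communication scheme is not described in this record.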
Keywords :
biocomputing; error correction; parallel algorithms; DNA sequencing; Reptile; low error rates; parallel algorithm; parallelizing error correction methods; redundant sampling; sequence errors; spectrum-based short read error correction; Computers; DNA; Error analysis; Error correction; Genomics; Hamming distance; Throughput; genome assembly; high-throughput sequencing; next-gen sequencing; parallel error correction; sequence base calling; short read error correction;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
Conference_Location :
Shanghai
ISSN :
1530-2075
Print_ISBN :
978-1-4673-0975-2
Type :
conf
DOI :
10.1109/IPDPS.2012.16
Filename :
6267824