Title :
Exploring parallelism in short sequence mapping using Burrows-Wheeler Transform
Author :
Doruk Bozdağ;Ayat Hatem;Umit V. Catalyurek
Author_Institution :
Department of Biomedical Informatics, The Ohio State University, Columbus, 43210, USA
Date :
4/1/2010
Abstract :
Next-generation high-throughput sequencing instruments can generate hundreds of millions of reads in a single run. Mapping those reads to a reference genome is an extremely compute-intensive process that takes more than a day on a modern computer, even when accuracy is traded off for speed. In this work, we explore various data distribution strategies for parallel execution of three state-of-the-art mapping tools, namely Bowtie, BWA and SOAP2, all of which are based on the Burrows-Wheeler Transform. We report on the performance of these strategies and show that the best strategy depends on the input scenario as well as on the relative efficiency of each tool in the indexing and matching steps of the mapping process. The parallelization strategies investigated in this paper are general and can easily be applied to other mapping algorithms. With parallel execution methods available, it becomes possible to carry out more intensive computations that cannot be completed in a reasonable time with sequential tools, including mapping with a larger mismatch tolerance.
Keywords :
"Sequences","Genomics","Bioinformatics","Parallel processing","Distribution strategy","Throughput","Instruments","DNA","Biomedical informatics","Biomedical computing"
Conference_Title :
Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on
Print_ISBN :
978-1-4244-6533-0
DOI :
10.1109/IPDPSW.2010.5470897