DocumentCode :
244464
Title :
Burrows-Wheeler Transform based indexed exact search on a multi-GPU OpenCL platform
Author :
Nogueira, David ; Tomas, Pedro ; Roma, Nuno
Author_Institution :
Inst. Super. Tecnico, Univ. de Lisboa, Lisbon, Portugal
fYear :
2014
fDate :
21-25 July 2014
Firstpage :
31
Lastpage :
38
Abstract :
A multi-GPU parallelization of exact string matching algorithms based on the backward-search procedure by using indexing techniques, such as the Burrows-Wheeler Transform and the FM-Index, is proposed in this paper. To attain an efficient execution on highly heterogeneous parallel platforms, the proposed parallelization adopted an unified OpenCL implementation that allows its execution either in CPUs and in multiple and possibly different GPU devices (e.g., NVIDIA and AMD GPUs) that integrate the targeted platform. Furthermore, the proposed implementation incorporates convenient load-balancing techniques, in order to ensure not only a convenient balance of the involved workload to minimize the resulting processing time, but also the possibility to scale the offered throughput with the number of exploited GPUs. The obtained experimental results showed that the proposed multi-GPU parallelization platform is able to offer significant speedups (greater than 10×, when using one single GPU) when compared to conventional mainstream multi-threaded CPU implementations (Bowtie - 8 threads), and between 5× and 30× when compared to other popular BWT-based aligners, namely BWA and SOAP2, using their multi-threading options. When compared with state of the art GPU implementations (e.g., SOAP3, HPG-BWT, Barracuda and CUSHAW2-GPU), the proposed implementation showed to be able to provide speedups between 2.5× and 5×. The execution of the proposed alignment platform when considering multiple and completely distinct GPU devices demonstrated the ability to efficiently scale the resulting throughput, by offering a convenient load-balancing of the involved processing in the several distinct devices.
Keywords :
graphics processing units; indexing; information retrieval; multi-threading; multiprocessing systems; string matching; transforms; AMD GPU; Burrows-Wheeler transform; FM index; NVIDIA GPU; backward-search procedure; exact string matching algorithms; graphics processing unit; highly heterogeneous parallel platforms; indexed exact search; indexing techniques; load balancing techniques; multi-GPU OpenCL platform; multi-GPU parallelization; multi-threading options; parallelization; Arrays; Buffer storage; Graphics processing units; Indexes; Instruction sets; Kernel; Burrow-Wheeler Transform; FM-index; Graphics Processing Unit (GPU); OpenCL; Parallel Computing; exact string matching;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
High Performance Computing & Simulation (HPCS), 2014 International Conference on
Conference_Location :
Bologna
Print_ISBN :
978-1-4799-5312-7
Type :
conf
DOI :
10.1109/HPCSim.2014.6903666
Filename :
6903666
Link To Document :
بازگشت