DocumentCode :
46249
Title :
Parallel Implementation of MAFFT on CUDA-Enabled Graphics Hardware
Author :
Xiangyuan Zhu ; Kenli Li ; Ahmad Salah ; Lin Shi ; Keqin Li
Author_Institution :
Coll. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
Volume :
12
Issue :
1
Year :
2015
Date :
Jan.-Feb. 1 2015
Firstpage :
205
Lastpage :
218
Abstract :
Multiple sequence alignment (MSA) constitutes an extremely powerful tool for many biological applications, including phylogenetic tree estimation, secondary structure prediction, and critical residue identification. However, aligning large biological sequences with popular tools such as MAFFT requires long runtimes on sequential architectures. Due to the ever-increasing sizes of sequence databases, there is increasing demand to accelerate this task. In this paper, we demonstrate how graphics processing units (GPUs), powered by the compute unified device architecture (CUDA), can be used as an efficient computational platform to accelerate the MAFFT algorithm. To fully exploit the GPU's capabilities for accelerating MAFFT, we have optimized the sequence data organization to eliminate the bandwidth bottleneck of memory access, designed a memory allocation and reuse strategy to make full use of the limited memory of GPUs, proposed a new modified-run-length encoding (MRLE) scheme to reduce memory consumption, and used high-performance shared memory to speed up I/O operations. Our implementation, tested on three NVIDIA GPUs, achieves a speedup of up to 11.28 on a Tesla K20m GPU compared to the sequential MAFFT 7.015.
Keywords :
bioinformatics; encoding; evolution (biological); genetics; graphics processing units; parallel architectures; trees (mathematics); CUDA-enabled graphics hardware; I/O operations; MAFFT algorithm; NVIDIA GPU; Tesla K20m GPU; bandwidth bottleneck elimination; biological applications; biological sequences; computational platform; compute unified device architecture; critical residue identification; graphic processing units; high-performance shared memory; memory consumption reduction; modified-run-length encoding scheme; multiple sequence alignment; parallel implementation; phylogenetic tree estimation; secondary structure prediction; sequence data organization; sequence databases; sequential MAFFT 7.015; sequential architectures; Acceleration; Algorithm design and analysis; Bioinformatics; Computational biology; Graphics processing units; IEEE transactions; Instruction sets; CUDA; GPGPU; MAFFT; graphics hardware; sequence alignment;
Language :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2014.2351801
Filename :
6883183