Title :
A Parallel Algorithm for the Best k-Mismatches Alignment Problem
Author :
Del Fabbro, Cristian ; Tardivo, Fabio ; Policriti, Alberto
Author_Institution :
Inst. of Appl. Genomics, Udine, Italy
Abstract :
We propose a parallel algorithm that solves the best k-mismatches alignment problem against a genomic reference using the "one sequence/multiple processes" paradigm and distributed memory. Our proposal is designed to take advantage of a computing cluster using MPI (Message Passing Interface) for communication. Our solution distributes the reference among different nodes and each sequence is processed concurrently by different nodes. When a (putative) best solution is found, the successful process propagates the information to other nodes, reducing search space and saving computation time. The distributed algorithm was developed in C++ and optimized for the PLX and FERMI supercomputers, but it is compatible with every OpenMPI-based cluster. It was included in the ERNE (Extended Randomized Numerical alignEr) package, whose aim is to provide an all-inclusive set of tools for short reads alignment and cleaning. ERNE is free software, distributed under the Open Source License (GPL V3) and can be downloaded at: http://erne.sourceforge.net. The algorithm described in this work is implemented in the ERNE-PMAP and ERNE-PBS5 programs, the former designed to align DNA and RNA sequences, while the latter is optimized for bisulphite-treated sequences.
Keywords :
C++ language; DNA; RNA; application program interfaces; bioinformatics; distributed memory systems; message passing; parallel algorithms; public domain software; software packages; C++ programming; DNA sequence alignment; ERNE-PBS5 programs; ERNE-PMAP programs; Extended Randomized Numerical aligner package; FERMI supercomputers; GPL V3; MPI; OpenMPI-based cluster; PLX supercomputers; RNA sequence alignment; best k-mismatches alignment problem; bisulphite-treated sequences; computation time; computing cluster; distributed algorithm; distributed memory; free software; message passing interface; one sequence-multiple processes paradigm; open source license; parallel algorithm; search space reduction; Bioinformatics; Clustering algorithms; DNA; Genomics; Next generation networking; Sequential analysis; Software; bs-seq; erne; mpi; parallel; sequence alignment;
Conference_Titel :
Parallel, Distributed and Network-Based Processing (PDP), 2014 22nd Euromicro International Conference on
Conference_Location :
Torino
DOI :
10.1109/PDP.2014.51