• DocumentCode
    125656
  • Title

    A Parallel Algorithm for the Best k-Mismatches Alignment Problem

  • Author

    Del Fabbro, Cristian ; Tardivo, Fabio ; Policriti, Alberto

  • Author_Institution
    Inst. of Appl. Genomics, Udine, Italy
  • fYear
    2014
  • fDate
    12-14 Feb. 2014
  • Firstpage
    586
  • Lastpage
    589
  • Abstract
    We propose a parallel algorithm that solves the best k-mismatches alignment problem against a genomic reference using the "one sequence/multiple processes" paradigm and distributed memory. Our proposal is designed to take advantage of a computing cluster using MPI (Message Passing Interface) for communication. Our solution distributes the reference among different nodes and each sequence is processed concurrently by different nodes. When a (putative) best solution is found, the successful process propagates the information to other nodes, reducing search space and saving computation time. The distributed algorithm was developed in C++ and optimized for the PLX and FERMI supercomputers, but it is compatible with every OpenMPI-based cluster. It was included in the ERNE (Extended Randomized Numerical alignEr) package, whose aim is to provide an all-inclusive set of tools for short reads alignment and cleaning. ERNE is free software, distributed under the Open Source License (GPL V3) and can be downloaded at: http://erne.sourceforge.net. The algorithm described in this work is implemented in the ERNE-PMAP and ERNE-PBS5 programs, the former designed to align DNA and RNA sequences, while the latter is optimized for bisulphite-treated sequences.
  • Keywords
    C++ language; DNA; RNA; application program interfaces; bioinformatics; distributed memory systems; message passing; parallel algorithms; public domain software; software packages; C++ programming; DNA sequence alignment; ERNE-PBS5 programs; ERNE-PMAP programs; Extended Randomized Numerical aligner package; FERMI supercomputers; GPL V3; MPI; OpenMPI-based cluster; PLX supercomputers; RNA sequence alignment; best k-mismatches alignment problem; bisulphite-treated sequences; computation time; computing cluster; distributed algorithm; distributed memory; free software; message passing interface; one sequence-multiple processes paradigm; open source license; parallel algorithm; search space reduction; Bioinformatics; Clustering algorithms; DNA; Genomics; Next generation networking; Sequential analysis; Software; bs-seq; erne; mpi; parallel; sequence alignment;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Parallel, Distributed and Network-Based Processing (PDP), 2014 22nd Euromicro International Conference on
  • Conference_Location
    Torino
  • ISSN
    1066-6192
  • Type

    conf

  • DOI
    10.1109/PDP.2014.51
  • Filename
    6787334