• DocumentCode
    23064
  • Title
    Accelerating the Next Generation Long Read Mapping with the FPGA-Based System
  • Author
    Peng Chen ; Chao Wang ; Xi Li ; Xuehai Zhou
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Sci. & Technol. of China, Hefei, China
  • Volume
    11
  • Issue
    5
  • fYear
    2014
  • fDate
    Sept.-Oct. 1 2014
  • Firstpage
    840
  • Lastpage
    852
  • Abstract
    Comparing newly determined sequences against the subject sequences stored in databases is a critical task in bioinformatics. Fortunately, a recent survey reports that state-of-the-art aligners are already fast enough to handle very large volumes of short sequence reads in a reasonable time. However, present aligners remain quite inefficient at aligning the long sequence reads (>400 bp) generated by next generation sequencing (NGS) technology. Furthermore, the challenge grows more serious as both the lengths and the numbers of sequence reads keep increasing with improvements in sequencing technology. Thus, there is an urgent need to enhance the performance of long read alignment. In this paper, we propose a novel FPGA-based system to improve the efficiency of long read mapping. Compared to the state-of-the-art long read aligner BWA-SW, our acceleration platform achieves high performance with almost the same sensitivity. Experiments demonstrate that, for reads with lengths ranging from 512 to 4,096 base pairs, the described system obtains a 10x-48x speedup for the bottleneck of the software. For the whole mapping procedure, the FPGA-based platform achieves a 1.8x-3.3x speedup versus the BWA-SW aligner, reducing the alignment cycles from weeks to days.
  • Keywords
    bioinformatics; dynamic programming; field programmable gate arrays; sequences; FPGA-based system; base pairs; databases; long read mapping; long sequence reads; next generation long read mapping acceleration; next-generation sequencing technology; short sequence reads; software; state-of-the-art aligners; state-of-the-art long read aligner BWA-SW; subject sequences; whole mapping procedure; Acceleration; Bioinformatics; Genomics; Indexes; Sequential analysis; Software; Software algorithms; BWA-SW; Sequence alignment; Smith-Waterman; hardware acceleration
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2014.2326876
  • Filename
    6822570