• DocumentCode
    659608
  • Title

    Parallel and memory-efficient Burrows-Wheeler transform

  • Author

    Hayashi, Shin´ichiro ; Taura, Koichi

  • Author_Institution
    Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
  • fYear
    2013
  • fDate
    6-9 Oct. 2013
  • Firstpage
    43
  • Lastpage
    50
  • Abstract
    In large scale applications of genome analysis, compressed indexes are very useful to avoid the memory usage imposed by suffix arrays. FM-Index is one of the most important such compressed indexes. While it is very memory efficient, its construction requires much larger working memory during construction than the final index itself. Specifically, the Burrows-Wheeler transform (BWT), which is computed on its way to constructing FM-Index, requires a suffix array, which is larger than BWT and FM-Index. To address this, there have been many efforts to compute the Burrows-Wheeler in a smaller space, one of the fastest of which achieves a linear time complexity. Since it is unlikely that any sequential algorithm achieves a sublinear time complexity, the next logical step is a parallel algorithm that achieves a sublinear critical path with a space requirement smaller than suffix arrays. In this paper, we propose such an algorithm. It is based on a divide-and-conquer algorithm, which can be efficiently executed with task parallel models. We evaluated our algorithm on human genome data and obtained almost the same speed with BWT-IS, which is the known fastest algorithm.
  • Keywords
    biology computing; computational complexity; genomics; parallel algorithms; transforms; BWT; BWT-IS algorithm; Burrows-Wheeler transform; FM-Index; compressed indexes; divide-and-conquer algorithm; genome analysis; human genome data; linear time complexity; parallel algorithm; sequential algorithm; space requirement; suffix arrays; task parallel models; Arrays; Bioinformatics; Genomics; Indexes; Merging; Time complexity; Transforms; Burrows-Wheeler transform; parallel computing; suffix array;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Big Data, 2013 IEEE International Conference on
  • Conference_Location
    Silicon Valley, CA
  • Type

    conf

  • DOI
    10.1109/BigData.2013.6691757
  • Filename
    6691757