• DocumentCode
    228699
  • Title
    Orion: Scaling Genomic Sequence Matching with Fine-Grained Parallelization
  • Author
    Mahadik, Kanak; Chaterji, Somali; Zhou, Bowen; Kulkarni, Milind; Bagchi, Saurabh
  • Author_Institution
    Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
  • fYear
    2014
  • fDate
    16-21 Nov. 2014
  • Firstpage
    449
  • Lastpage
    460
  • Abstract
    Gene sequencing instruments are producing huge volumes of data, straining the capabilities of current database searching algorithms and hindering efforts of researchers analyzing large collections of data to obtain greater insights. In the space of parallel genomic sequence search, most of the popular software packages, like mpiBLAST, use the database segmentation approach, wherein the entire database is sharded and searched on different nodes. However, this approach does not scale well with the increasing length of individual query sequences or with the rapid growth in size of sequence databases. In this paper, we propose a fine-grained parallelism technique, called Orion, that divides the input query into an adaptive number of fragments and shards the database. Our technique achieves higher parallelism (and hence speedup) and better load balancing than database sharding alone, while maintaining 100% accuracy. We show that it is 12.3X faster than mpiBLAST for solving a relevant comparative genomics problem.
  • Keywords
    biology computing; database management systems; genomics; query processing; string matching; Orion; database segmentation; fine-grained parallelization; genomic sequence matching; query sequence; Bioinformatics; DNA; Databases; Genomics; Organisms; Parallel processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    SC14: International Conference for High Performance Computing, Networking, Storage and Analysis
  • Conference_Location
    New Orleans, LA
  • Print_ISBN
    978-1-4799-5499-5
  • Type
    conf
  • DOI
    10.1109/SC.2014.42
  • Filename
    7013024