• DocumentCode
    3169416
  • Title
    Accelerating DNA analysis applications on GPU clusters

  • Author
    Tumeo, Antonino ; Villa, Oreste

  • Author_Institution
    High Performance Comput., Pacific Northwest Nat. Lab., Richland, WA, USA
  • fYear
    2010
  • fDate
    13-14 June 2010
  • Firstpage
    71
  • Lastpage
    76
  • Abstract
    DNA analysis is an emerging application of high performance bioinformatics. Modern sequencing machinery can provide, in a few hours, large input streams of data that need to be matched against exponentially growing databases of known fragments. The ability to recognize these patterns effectively and quickly may allow extending the scale and reach of the investigations performed by biology scientists. Aho-Corasick is an exact, multiple pattern matching algorithm often at the base of this application. In this paper we present an efficient implementation of the Aho-Corasick algorithm for high performance clusters accelerated with Graphics Processing Units (GPUs). We discuss how we partitioned and adapted the algorithm to fit the Tesla C1060 GPU and then present an MPI-based implementation for a heterogeneous high performance cluster. We compare this implementation to MPI-based and MPI-with-pthreads-based implementations for a homogeneous cluster of x86 processors, discussing the stability vs. the performance and the scaling of the solutions, taking into consideration aspects such as the bandwidth among the different nodes.
  • Keywords
    biocomputing; bioinformatics; coprocessors; pattern matching; Aho-Corasick algorithm; DNA analysis; GPU clusters; MPI based implementation; Tesla C1060 GPU; biology scientists; graphic processing units; multiple pattern matching algorithm; sequencing machinery; Acceleration; Bioinformatics; Clustering algorithms; DNA; Databases; Machinery; Partitioning algorithms; Pattern matching; Pattern recognition; Performance analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Application Specific Processors (SASP), 2010 IEEE 8th Symposium on
  • Conference_Location
    Anaheim, CA
  • Print_ISBN
    978-1-4244-7953-5
  • Type
    conf
  • DOI
    10.1109/SASP.2010.5521145
  • Filename
    5521145