DocumentCode
3169416
Title
Accelerating DNA analysis applications on GPU clusters
Author
Tumeo, Antonino ; Villa, Oreste
Author_Institution
High Performance Comput., Pacific Northwest Nat. Lab., Richland, WA, USA
fYear
2010
fDate
13-14 June 2010
Firstpage
71
Lastpage
76
Abstract
DNA analysis is an emerging application of high performance bioinformatics. Modern sequencing machinery is able to provide, in a few hours, large input streams of data which need to be matched against exponentially growing databases of known fragments. The ability to recognize these patterns effectively and quickly may allow extending the scale and the reach of the investigations performed by biology scientists. Aho-Corasick is an exact, multiple pattern matching algorithm that often forms the basis of this application. In this paper we present an efficient implementation of the Aho-Corasick algorithm for high performance clusters accelerated with Graphic Processing Units (GPUs). We discuss how we partitioned and adapted the algorithm to fit the Tesla C1060 GPU and then present an MPI-based implementation for a heterogeneous high performance cluster. We compare this implementation to MPI-based and MPI-with-pthreads-based implementations for a homogeneous cluster of x86 processors, discussing the stability versus the performance and the scaling of the solutions, taking into consideration aspects such as the bandwidth among the different nodes.
Keywords
biocomputing; bioinformatics; coprocessors; pattern matching; Aho-Corasick algorithm; DNA analysis; GPU clusters; MPI based implementation; Tesla C1060 GPU; biology scientists; graphic processing units; multiple pattern matching algorithm; sequencing machinery; Acceleration; Bioinformatics; Clustering algorithms; DNA; Databases; Machinery; Partitioning algorithms; Pattern matching; Pattern recognition; Performance analysis;
fLanguage
English
Publisher
ieee
Conference_Titel
Application Specific Processors (SASP), 2010 IEEE 8th Symposium on
Conference_Location
Anaheim, CA
Print_ISBN
978-1-4244-7953-5
Type
conf
DOI
10.1109/SASP.2010.5521145
Filename
5521145