DocumentCode :
3169416
Title :
Accelerating DNA analysis applications on GPU clusters
Author :
Tumeo, Antonino ; Villa, Oreste
Author_Institution :
High Performance Comput., Pacific Northwest Nat. Lab., Richland, WA, USA
fYear :
2010
fDate :
13-14 June 2010
Firstpage :
71
Lastpage :
76
Abstract :
DNA analysis is an emerging application of high-performance bioinformatics. Modern sequencing machinery can provide, in a few hours, large input streams of data that need to be matched against exponentially growing databases of known fragments. The ability to recognize these patterns effectively and quickly may allow extending the scale and the reach of the investigations performed by biology scientists. Aho-Corasick is an exact, multiple-pattern matching algorithm often at the base of this application. In this paper we present an efficient implementation of the Aho-Corasick algorithm for high-performance clusters accelerated with Graphics Processing Units (GPUs). We discuss how we partitioned and adapted the algorithm to fit the Tesla C1060 GPU and then present an MPI-based implementation for a heterogeneous high-performance cluster. We compare this implementation to MPI-based and MPI-with-pthreads-based implementations for a homogeneous cluster of x86 processors, discussing the stability versus the performance and the scaling of the solutions, taking into consideration aspects such as the bandwidth among the different nodes.
Keywords :
biocomputing; bioinformatics; coprocessors; pattern matching; Aho-Corasick algorithm; DNA analysis; GPU clusters; MPI based implementation; Tesla C1060 GPU; bioinformatics; biology scientists; graphic processing units; multiple pattern matching algorithm; sequencing machinery; Acceleration; Bioinformatics; Clustering algorithms; DNA; Databases; Machinery; Partitioning algorithms; Pattern matching; Pattern recognition; Performance analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Application Specific Processors (SASP), 2010 IEEE 8th Symposium on
Conference_Location :
Anaheim, CA
Print_ISBN :
978-1-4244-7953-5
Type :
conf
DOI :
10.1109/SASP.2010.5521145
Filename :
5521145