DocumentCode :
5582
Title :
GPU-to-GPU and Host-to-Host Multipattern String Matching on a GPU
Author :
Xinyan Zha ; Sahni, Shashank
Author_Institution :
Comput. & Inf. Sci. & Eng, Univ. of Florida, Gainesville, FL, USA
Volume :
62
Issue :
6
fYear :
2013
fDate :
Jun-13
Firstpage :
1156
Lastpage :
1169
Abstract :
We develop GPU adaptations of the Aho-Corasick and multipattern Boyer-Moore string matching algorithms for the two cases GPU-to-GPU (input to the algorithms is initially in GPU memory and the output is left in GPU memory) and host-to-host (input and output are in the memory of the host CPU). For the GPU-to-GPU case, we consider several refinements to a base GPU implementation and measure the performance gain from each refinement. For the host-to-host case, we analyze two strategies to communicate between the host and the GPU and show that one is optimal with respect to runtime while the other requires less device memory. This analysis is done for GPUs with one I/O channel to the host as well as those with 2. Experiments conducted on an NVIDIA Tesla GT200 GPU that has 240 cores running off of a Xeon 2.8 GHz quad-core host CPU show that, for the GPU-to-GPU case, our Aho-Corasick GPU adaptation achieves a speedup between 8.5 and 9.5 relative to a single-thread CPU implementation and between 2.4 and 3.2 relative to the best multithreaded implementation. For the host-to-host case, the GPU AC code achieves a speedup of 3.1 relative to a single-threaded CPU implementation. However, the GPU is unable to deliver any speedup relative to the best multithreaded code running on the quad-core host. In fact, the measured speedups for the latter case ranged between 0.74 and 0.83. Early versions of our multipattern Boyer-Moore adaptations ran 7 to 10 percent slower than corresponding versions of the AC adaptations and we did not refine the multipattern Boyer-Moore codes further.
Keywords :
graphics processing units; multi-threading; pattern matching; Boyer-Moore adaptations; GPU AC code; GPU adaptations; GPU memory; GPU-to-GPU; I/O channel; NVIDIA Tesla GT200 GPU; device memory; host-to-host multipattern string matching; multipattern Boyer-Moore string matching algorithms; multithreaded implementation; Arrays; Bandwidth; Dictionaries; Doped fiber amplifiers; Graphics processing unit; Instruction sets; Pattern matching; Aho-Corasick; CUDA; GPU; Multipattern string matching; multipattern Boyer-Moore;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
ieee
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2012.61
Filename :
6165263
Link To Document :
بازگشت