Title :
Accelerating Pattern Matching Using a Novel Parallel Algorithm on GPUs
Author :
Cheng-Hung Lin ; Chen-Hsiung Liu ; Lung-Sheng Chien ; Shih-Chieh Chang
Author_Institution :
Dept. of Technol. Applic. & Human Resource Dev., Nat. Taiwan Normal Univ., Taipei, Taiwan
Abstract :
Graphics processing units (GPUs) have attracted a lot of attention due to their cost-effective and enormous power for massive data parallel computing. In this paper, we propose a novel parallel algorithm for exact pattern matching on GPUs. A traditional exact pattern matching algorithm matches multiple patterns simultaneously by traversing a special state machine called an Aho-Corasick machine. Considering the particular parallel architecture of GPUs, in this paper, we first propose an efficient state machine on which we perform very efficient parallel algorithms. Also, several techniques are introduced to do optimization on GPUs, including reducing global memory transactions of input buffer, reducing latency of transition table lookup, eliminating output table accesses, avoiding bank-conflict of shared memory, coalescing writes to global memory, and enhancing data transmission via peripheral component interconnect express. We evaluate the performance of the proposed algorithm using attack patterns from Snort V2.8 and input streams from DEFCON. The experimental results show that the proposed algorithm performed on NVIDIA GPUs achieves up to 143.16-Gbps throughput, 14.74 times faster than the Aho-Corasick algorithm implemented on a 3.06-GHz quad-core CPU with the OpenMP. The library of the proposed algorithm is publically accessible through Google Code.
Keywords :
buffer storage; finite state machines; graphics processing units; multiprocessing systems; optimisation; parallel algorithms; parallel architectures; parallel memories; pattern matching; peripheral interfaces; security of data; table lookup; Aho-Corasick machine; DEFCON; GPU; Google Code; NVIDIA; OpenMP; Snort V2.8; attack pattern; bit rate 143.16 Gbit/s; buffer; exact pattern matching algorithm; frequency 3.06 GHz; graphics processing unit; massive data parallel computing; optimization; parallel algorithm; parallel architecture; peripheral component interconnect express; quad core CPU; state machine traversing; throughput; transition table lookup; Acceleration; Algorithm design and analysis; Complexity theory; Graphics processing unit; Instruction sets; Pattern matching; Vectors; Aho-Corasick; Graphics processing units; parallel algorithm; pattern matching;
Journal_Title :
Computers, IEEE Transactions on
DOI :
10.1109/TC.2012.254