Accelerating Pattern Matching Using a Novel Parallel Algorithm on GPUs

Author

Cheng-Hung Lin ; Chen-Hsiung Liu ; Lung-Sheng Chien ; Shih-Chieh Chang

Author_Institution

Dept. of Technol. Applic. & Human Resource Dev., Nat. Taiwan Normal Univ., Taipei, Taiwan

Volume

62

Issue

10

fYear

2013

fDate

Oct. 2013

Firstpage

1906

Lastpage

1916

Abstract

Graphics processing units (GPUs) have attracted a lot of attention due to their cost-effective and enormous power for massive data parallel computing. In this paper, we propose a novel parallel algorithm for exact pattern matching on GPUs. A traditional exact pattern matching algorithm matches multiple patterns simultaneously by traversing a special state machine called an Aho-Corasick machine. Considering the particular parallel architecture of GPUs, in this paper, we first propose an efficient state machine on which we perform very efficient parallel algorithms. Also, several techniques are introduced to do optimization on GPUs, including reducing global memory transactions of input buffer, reducing latency of transition table lookup, eliminating output table accesses, avoiding bank-conflict of shared memory, coalescing writes to global memory, and enhancing data transmission via peripheral component interconnect express. We evaluate the performance of the proposed algorithm using attack patterns from Snort V2.8 and input streams from DEFCON. The experimental results show that the proposed algorithm performed on NVIDIA GPUs achieves up to 143.16-Gbps throughput, 14.74 times faster than the Aho-Corasick algorithm implemented on a 3.06-GHz quad-core CPU with the OpenMP. The library of the proposed algorithm is publically accessible through Google Code.

Keywords

buffer storage; finite state machines; graphics processing units; multiprocessing systems; optimisation; parallel algorithms; parallel architectures; parallel memories; pattern matching; peripheral interfaces; security of data; table lookup; Aho-Corasick machine; DEFCON; GPU; Google Code; NVIDIA; OpenMP; Snort V2.8; attack pattern; bit rate 143.16 Gbit/s; buffer; exact pattern matching algorithm; frequency 3.06 GHz; graphics processing unit; massive data parallel computing; optimization; parallel algorithm; parallel architecture; peripheral component interconnect express; quad core CPU; state machine traversing; throughput; transition table lookup; Acceleration; Algorithm design and analysis; Complexity theory; Graphics processing unit; Instruction sets; Pattern matching; Vectors; Aho-Corasick; Graphics processing units; parallel algorithm; pattern matching;

fLanguage

English

Journal_Title

Computers, IEEE Transactions on

Publisher

ieee

ISSN

0018-9340

Type

jour

DOI

10.1109/TC.2012.254

Filename

6338923