DocumentCode
3740
Title
A Memory-Efficient and Modular Approach for Large-Scale String Pattern Matching
Author
Le, Hoang ; Prasanna, Viktor K.
Author_Institution
Dept. of Electr. & Comput. Eng., Univ. of Southern California, Los Angeles, CA, USA
Volume
62
Issue
5
fYear
2013
fDate
May 2013
Firstpage
844
Lastpage
857
Abstract
In Network Intrusion Detection Systems (NIDSs), string pattern matching must deliver exceptionally high performance to match the content of network traffic against a predefined database (or dictionary) of malicious patterns. Much work has been done in this field; however, most prior designs have low memory efficiency (defined as the ratio of the required storage in bytes to the size of the dictionary in characters). Because of this inefficiency, state-of-the-art designs cannot support large dictionaries without using high-latency external DRAM. We propose an algorithm called "leaf-attaching" to preprocess a given dictionary without increasing the number of patterns. The resulting set of postprocessed patterns can be searched using any tree-search data structure. We also present a scalable, high-throughput, Memory-efficient Architecture for large-scale String Matching (MASM) based on a pipelined binary search tree. The proposed algorithm and architecture achieve a memory efficiency of 0.56 (for the Rogets dictionary) and 1.32 (for the Snort dictionary). As a result, our design scales well to support larger dictionaries. Implementations on 45 nm ASIC and a state-of-the-art FPGA device (for the latest Rogets and Snort dictionaries) show that our architecture achieves 24 and 3.2 Gbps, respectively. The MASM module can simply be duplicated to accept multiple characters per cycle, so throughput scales with the number of characters processed in each cycle. A dictionary update only requires rewriting the memory content, which can be done quickly without reconfiguring the chip.
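The memory-efficiency metric defined in the abstract (bytes of storage per dictionary character, so lower is better) can be illustrated with a minimal sketch. The function names and the 100,000-character dictionary size below are assumptions made for illustration; only the ratios 0.56 and 1.32 come from the abstract.

```python
def memory_efficiency(storage_bytes: float, dictionary_chars: int) -> float:
    """Bytes of storage required per dictionary character (lower is better)."""
    if dictionary_chars <= 0:
        raise ValueError("dictionary_chars must be positive")
    return storage_bytes / dictionary_chars


def storage_needed(efficiency: float, dictionary_chars: int) -> float:
    """Approximate storage in bytes implied by a given memory efficiency."""
    if dictionary_chars <= 0:
        raise ValueError("dictionary_chars must be positive")
    return efficiency * dictionary_chars


if __name__ == "__main__":
    # Hypothetical dictionary size, NOT a figure reported in the paper.
    hypothetical_chars = 100_000
    # Ratios quoted in the abstract for the two dictionaries.
    for name, ratio in [("Rogets", 0.56), ("Snort", 1.32)]:
        approx_bytes = storage_needed(ratio, hypothetical_chars)
        print(f"{name}: about {approx_bytes:.0f} bytes for {hypothetical_chars} characters")
```

Under this definition, a ratio below 1 means the stored structure takes fewer bytes than the dictionary has characters.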
Keywords
Internet; application specific integrated circuits; computer network security; field programmable gate arrays; storage management; string matching; tree searching; ASIC; FPGA device; MASM module; NIDS; Rogets dictionary; Snort dictionary; characters per cycle; content matching; dictionary dictionary; dictionary preprocessing; dictionary update; high-latency external DRAM; large-scale string pattern matching; leaf-attaching; malicious pattern dictionary; memory content rewriting; modular approach; network intrusion detection systems; network traffic; pipelined binary search tree; predefined database; scalable high-throughput memory-efficient architecture; tree-search data structure; Databases; Dictionaries; Memory management; Pattern matching; Throughput; Vectors; Aho-Corasick; DFA; Rogets; Snort; String matching; field-programmable gate array (FPGA); leaf attaching; pipeline; reconfigurable
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/TC.2012.38
Filename
6148214