DocumentCode :
2667950
Title :
Memory-Efficient Regular Expression Search Using State Merging
Author :
Becchi, Michela ; Cadambi, Srihari
Author_Institution :
Washington Univ., St. Louis
fYear :
2007
fDate :
6-12 May 2007
Firstpage :
1064
Lastpage :
1072
Abstract :
Pattern matching is a crucial task in several critical network services such as intrusion detection and policy management. As the complexity of rule-sets increases, traditional string matching engines are being replaced by more sophisticated regular expression engines. To keep up with line rates, deal with denial of service attacks and provide predictable resource provisioning, the design of such engines must allow examining payload traffic at several gigabits per second and provide worst case speed guarantees. While regular expression matching using deterministic finite automata (DFA) is a well studied problem in theory, its implementation either in software or specialized hardware is complicated by prohibitive memory requirements. This is especially true for DFAs representing complex regular expressions present in practical rule-sets. In this paper, we introduce a novel method to drastically reduce the DFA memory requirement and still provide worst-case speed guarantees. Specifically, we merge several "non-equivalent" states in a DFA by introducing labels on their input and output transitions. We then propose a data structure to represent the merged states and the transition labels. We show that, with very few assumptions about the original DFA, such a transformation results in significant compression in the DFA representation. We have implemented a state merging and transition labeling algorithm for DFAs, and show that for Snort and Bro security rule-sets, state merging results in memory reductions of an order of magnitude.
Keywords :
finite automata; search problems; security of data; telecommunication security; DFA memory requirement; critical network services; data structure; denial of service attacks; deterministic finite automata; intrusion detection; line rates; memory-efficient regular expression search; merged states; non-equivalent states; pattern matching; payload traffic; policy management; regular expression engines; regular expression matching; resource provisioning; rule-sets; state merging; string matching engines; transition labeling algorithm; transition labels; Automata; Computer crime; Data structures; Doped fiber amplifiers; Engines; Hardware; Intrusion detection; Merging; Pattern matching; Payloads;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
INFOCOM 2007. 26th IEEE International Conference on Computer Communications. IEEE
Conference_Location :
Anchorage, AK
ISSN :
0743-166X
Print_ISBN :
1-4244-1047-9
Type :
conf
DOI :
10.1109/INFCOM.2007.128
Filename :
4215710
Link To Document :
بازگشت