Abstract :
Pattern matching is one of the most important components for the content inspection based applications of network security, and it requires well designed algorithms and architectures to keep up with the increasing network speed. For most of the solutions, AC and its derivative algorithms are widely used. They are based on the DFA model but utilize large amount of memory because of so many transition rules. An algorithm, called ACC, is presented in this paper for multiple pattern matching. It uses a novel model, namely cached deterministic finite automate (CDFA). In ACC, by using CDFA, only 4.1% transition rules for ClamAV (20.8% for Snort) are needed to represent the same function using DFA built by AC. This paper also proposes a new scheme named next-state addressing (NSA) to store and access transition rules of DFA in memory. Using this method, transition rules can be efficiently stored and directly accessed. Finally the architecture for multiple pattern matching is optimized by several approaches. Experiments show our architecture can achieve matching speed faster than 10 Gbps with very efficient memory utilization, i.e., 81KB memory for 1.8 K Snort rules with total 29 K characters, and 9.5 MB memory for 50 K ClamAV rules with total 4.44 M characters. A single architecture is memory efficient for large pattern set, and it is possible to support more than 10 M patterns with at most half amount of the memory utilization compared to the state-of-the-art architectures.
Keywords :
cache storage; computer networks; deterministic automata; finite automata; pattern matching; telecommunication security; ACC algorithm; cached deterministic finite automate; computer network security; content inspection based application; memory efficient multiple pattern matching architecture; next-state addressing scheme; Algorithm design and analysis; Application software; Communications Society; Computer architecture; Computer science; Computer security; Doped fiber amplifiers; Inspection; Intrusion detection; Pattern matching;