• DocumentCode
    2085086
  • Title

    A Multi-dimensional Progressive Perfect Hashing for High-Speed String Matching

  • Author

    Xu, Yang ; Ma, Lei ; Liu, Zhaobo ; Chao, H. Jonathan

  • Author_Institution
    ECE Dept., NYU, Brooklyn, NY, USA
  • fYear
    2011
  • fDate
    3-4 Oct. 2011
  • Firstpage
    167
  • Lastpage
    177
  • Abstract
    Aho-Corasick (AC) automaton is widely used for multi-string matching in today´s Network Intrusion Detection System (NIDS). With fast-growing rule sets, implementing AC automaton with a small memory without sacrificing its performance has remained challenging in NIDS design. In this paper, we propose a multi-dimensional progressive perfect hashing algorithm named P2-Hashing, which allows transitions of an AC automaton to be placed in a compact hash table without any collision. P2-Hashing is based on the observation that a hash key of each transition consists of two dimensions, namely a source state ID and an input character. When placing a transition in a hash table and causing a collision, we can change the value of a dimension of the hash key to rehash the transition to a new location of the hash table. For a given AC automaton, P2-Hashing first divides all the transitions into many small sets based on the two-dimensional values of the hash keys, and then places the sets of transitions progressively into the hash table until all are placed. Hash collisions that occurred during the insertion of a transition will only affect the transitions in the same set. The proposed P2-Hashing has many unique properties, including fast hash index generation and zero memory overhead, which are very suitable for the AC automaton operation. The feasibility and performance of P2-Hashing are investigated through simulations on the full Snort (6.4k rules) and Clam AV (54k rules) rule sets, each of which is first converted to a single AC automaton. Simulation results show that P2-Hashing can successfully construct the perfect hash table even when the load factor of the hash table is as high as 0.91.
  • Keywords
    automata theory; computer network security; file organisation; set theory; string matching; Aho-Corasick automaton; NIDS design; P2-hashing algorithm; fast hash index generation; hash table; high-speed string matching; multidimensional progressive perfect hashing algorithm; network intrusion detection system; rule sets; source state ID; zero memory overhead; Algorithm design and analysis; Automata; Bipartite graph; Indexes; Memory management; Security; Throughput; Aho-Corasick Automaton; Hash Collision; Multi-string Matching; Perfect Hash Table;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Architectures for Networking and Communications Systems (ANCS), 2011 Seventh ACM/IEEE Symposium on
  • Conference_Location
    Brooklyn, NY
  • Print_ISBN
    978-1-4577-1454-2
  • Electronic_ISBN
    978-0-7695-4521-9
  • Type

    conf

  • DOI
    10.1109/ANCS.2011.33
  • Filename
    6062729