• DocumentCode
    3176895
  • Title

    A Memory-Efficient Multi-pattern Matching Algorithm Based on the Bitmap

  • Author

    Zhang, Weizhe ; Zhang, Yuanjing ; Zhang, HongLi ; Gu, Xuemai ; Cheng, Albert M K

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
  • fYear
    2009
  • fDate
    21-22 Dec. 2009
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    The Aho-Corasick algorithm is an efficient multiple pattern matching algorithm for large scale pattern sets. However, it consumes too much memory. A new efficient space optimization algorithm (AC-Bitmap) is proposed, which is based on the data structure of the bitmap. It divides all the states in the automata into two groups by their depths in the dictionary tree of all the patterns, and reduces the deeper group´s memory consumption which is retrieved less in matching. For the latter group, it also makes use of the bitmap to improve its retrieval time efficiency. Experiments indicate that the AC-Bitmap algorithm significantly reduces the memory consumption, which still keeps the time efficiency compared to the AC algorithm for random texts and literatures.
  • Keywords
    data structures; optimisation; pattern matching; text analysis; trees (mathematics); AC-Bitmap; Aho-Corasick algorithm; data structure; large scale pattern sets; memory consumption; memory-efficient multipattern matching algorithm; random texts; space optimization algorithm; Automata; Computer science; Data structures; Dictionaries; Information retrieval; Internet; Intrusion detection; Large-scale systems; Pattern matching; Space technology; AC-Bitmap algorithm; Aho-Corasick algorithm; multipattern matching; string matching;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Internet Computing for Science and Engineering (ICICSE), 2009 Fourth International Conference on
  • Conference_Location
    Harbin
  • Print_ISBN
    978-1-4244-6754-9
  • Type

    conf

  • DOI
    10.1109/ICICSE.2009.37
  • Filename
    5521643