DocumentCode :
3176895
Title :
A Memory-Efficient Multi-pattern Matching Algorithm Based on the Bitmap
Author :
Zhang, Weizhe ; Zhang, Yuanjing ; Zhang, HongLi ; Gu, Xuemai ; Cheng, Albert M K
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
fYear :
2009
fDate :
21-22 Dec. 2009
Firstpage :
1
Lastpage :
5
Abstract :
The Aho-Corasick algorithm is an efficient multiple pattern matching algorithm for large scale pattern sets. However, it consumes too much memory. A new efficient space optimization algorithm (AC-Bitmap) is proposed, which is based on the data structure of the bitmap. It divides all the states in the automata into two groups by their depths in the dictionary tree of all the patterns, and reduces the deeper group´s memory consumption which is retrieved less in matching. For the latter group, it also makes use of the bitmap to improve its retrieval time efficiency. Experiments indicate that the AC-Bitmap algorithm significantly reduces the memory consumption, which still keeps the time efficiency compared to the AC algorithm for random texts and literatures.
Keywords :
data structures; optimisation; pattern matching; text analysis; trees (mathematics); AC-Bitmap; Aho-Corasick algorithm; data structure; large scale pattern sets; memory consumption; memory-efficient multipattern matching algorithm; random texts; space optimization algorithm; Automata; Computer science; Data structures; Dictionaries; Information retrieval; Internet; Intrusion detection; Large-scale systems; Pattern matching; Space technology; AC-Bitmap algorithm; Aho-Corasick algorithm; multipattern matching; string matching;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Internet Computing for Science and Engineering (ICICSE), 2009 Fourth International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4244-6754-9
Type :
conf
DOI :
10.1109/ICICSE.2009.37
Filename :
5521643
Link To Document :
بازگشت