DocumentCode :
2718039
Title :
Fast string matching with space-efficient word graphs
Author :
Yata, Susumu ; Morita, Kazuhiro ; Fuketa, Masao ; Aoe, Jun-Ichi
Author_Institution :
Inst. of Technol. & Sci., Univ. of Tokushima, Tokushima
fYear :
2008
fDate :
16-18 Dec. 2008
Firstpage :
79
Lastpage :
83
Abstract :
String matching is one of the fundamentals in various text-processing applications such as text mining and content filtering systems. This paper describes a fast string matching algorithm using a compact pattern matching machine DAWG. A directed acyclic word graph (DAWG) is traditionally implemented with a 2-dimensional linked list or matrix. However, DAWGs with these structures have drawbacks, the lookup time of the linked list based one is slow and the space requirement of the matrix based one is large. Therefore, this paper proposes a novel DAWG based on a compacted double-array, which overcomes the drawbacks of traditional ones. Experimental results show that the novel DAWG is more efficient than traditional ones.
Keywords :
data mining; directed graphs; pattern matching; text analysis; compact pattern matching machine; compacted double-array; content filtering systems; directed acyclic word graph; fast string matching; space-efficient word graphs; text mining; text-processing applications; Application specific integrated circuits; Dictionaries; Electronic mail; Field programmable gate arrays; Filtering; Handheld computers; Matched filters; Pattern matching; Space technology; Text mining;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Innovations in Information Technology, 2008. IIT 2008. International Conference on
Conference_Location :
Al Ain
Print_ISBN :
978-1-4244-3396-4
Electronic_ISBN :
978-1-4244-3397-1
Type :
conf
DOI :
10.1109/INNOVATIONS.2008.4781726
Filename :
4781726
Link To Document :
بازگشت