Title :
The String-to-Dictionary Matching Problem
Author :
Klein, Shmuel T. ; Shapira, Dana
Author_Institution :
Dept. of Comput. Sci., Bar Ilan Univ., Ramat Gan, Israel
Abstract :
The String-to-Dictionary Matching Problem is defined, in which a string is searched for in all the possible concatenations of the elements of a given dictionary, with applications to compressed matching in variable to fixed length encodings, such as Tunstall´s. An algorithm based on suffix trees is suggested and experiments on natural language text are presented suggesting that compressed search might use less comparisons for long enough patterns, in spite of a potentially large number of encodings.
Keywords :
data compression; dictionaries; encoding; information retrieval; natural language processing; string matching; compressed matching; fixed length encoding; natural language text; string searching; string-to-dictionary matching problem; Computer science; Data compression; Decoding; Dictionaries; Encoding; Indexes; Pattern matching; Compressed Matching; Suffix Trees; Tunstall;
Conference_Titel :
Data Compression Conference (DCC), 2011
Conference_Location :
Snowbird, UT
Print_ISBN :
978-1-61284-279-0
DOI :
10.1109/DCC.2011.21