Title :
An approach to designing very fast approximate string matching algorithms
Author :
Du, M.-W. ; Chang, S.C.
Author_Institution :
Comput. and Intelligent Syst. Lab., GTE Labs. Inc., Waltham, MA, USA
fDate :
8/1/1994 12:00:00 AM
Abstract :
An approach to designing very fast algorithms for approximate string matching in a dictionary is proposed. Multiple spelling errors corresponding to insert, delete, change, and transpose operations on character strings are considered in the fault model. The design of very fast approximate string matching algorithms through a four-step reduction procedure is described. The final and most effective step uses hashing techniques to avoid comparing the given word with words at large distances. The technique has been applied to a library book catalog textbase. The experiments show that performing approximate string matching for a large dictionary in real-time on an ordinary sequential computer under our multiple fault model is feasible
Keywords :
database theory; error correction; query processing; search problems; character strings; dictionary; error correction; fast approximate string matching algorithms; fault model; four-step reduction procedure; hashing techniques; information retrieval; large dictionary; library book catalog textbase; multiple fault model; nearest neighbor search; sequential computer; spelling errors; Algorithm design and analysis; Books; Computer applications; Computer errors; Databases; Dictionaries; Error correction; Information retrieval; Libraries; Nearest neighbor searches;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on