Title :
Improved retrieval of foreign names from large databases
Author :
Oshika, Beatrice T. ; Evans, Bruce ; Machi, Filip ; Tom, Janet
Author_Institution :
SPARTA Inc., Berkeley, CA, USA
Abstract :
Enhancements to the name-search techniques currently used to access large databases of proper names are described. The improvements include a hidden-Markov-model (HMM) statistical classifier that identifies the likely linguistic source of a proper name, and language-specific rules that generate plausible spelling variants of names. These two components were incorporated into a prototype front-end system driving existing name-search procedures. Preliminary evaluation indicates that retrieval improved by 20-30%, as measured by the number of correct items retrieved.
Keywords :
Markov processes; computational linguistics; database management systems; database theory; information retrieval; foreign names; front-end system; hidden-Markov-model; language-specific rules; large databases; linguistic source; name-search techniques; retrieval; spelling variants; statistical classifier; Current measurement; Databases; Hidden Markov models; Information retrieval; Mathematics; Natural languages; Prototypes; Search problems; Sun; Testing
Conference_Title :
Data Engineering, 1988. Proceedings. Fourth International Conference on
Conference_Location :
Los Angeles, CA
Print_ISBN :
0-8186-0827-7
DOI :
10.1109/ICDE.1988.105494
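Illustrative_Sketch :
The abstract names two components: a statistical classifier for the likely linguistic source of a name, and language-specific rules that generate spelling variants. The following is only a rough sketch of how such a front end might be wired together, not the authors' implementation. It substitutes a simple character-bigram Markov chain with add-one smoothing for the HMM, since the abstract gives no model details. The training names, the rewrite rules, and the function names (`train`, `log_likelihood`, `classify`, `variants`) are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy training names per language; not taken from the paper.
TRAINING = {
    "spanish": ["garcia", "martinez", "rodriguez", "hernandez", "lopez"],
    "german": ["schmidt", "schneider", "fischer", "hoffmann", "schulz"],
}

# Hypothetical spelling-variant rules (substring, replacement); not from the paper.
RULES = {
    "spanish": [("z", "s"), ("c", "s")],
    "german": [("sch", "sh"), ("nn", "n")],
}


def train(names):
    """Count character-bigram transitions, with ^ and $ as boundary markers."""
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        padded = "^" + name + "$"
        for a, b in zip(padded, padded[1:]):
            counts[a][b] += 1
    return counts


def log_likelihood(name, counts, alphabet_size=28):
    """Log-probability of a name under one bigram model (add-one smoothing)."""
    padded = "^" + name.lower() + "$"
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        row = counts.get(a, {})
        total += math.log((row.get(b, 0) + 1) / (sum(row.values()) + alphabet_size))
    return total


MODELS = {lang: train(names) for lang, names in TRAINING.items()}


def classify(name):
    """Return the language whose model assigns the name the highest likelihood."""
    return max(MODELS, key=lambda lang: log_likelihood(name, MODELS[lang]))


def variants(name, lang):
    """Apply each rule for the language in turn, keeping originals and rewrites."""
    found = {name.lower()}
    for old, new in RULES.get(lang, []):
        found |= {v.replace(old, new) for v in found}
    return sorted(found)


if __name__ == "__main__":
    query = "Schmidt"
    lang = classify(query)
    # Each variant would then be passed to the existing name-search procedure.
    print(lang, variants(query, lang))
```

In a front end of this shape, every generated variant would be submitted to the existing search procedure and the results merged, which is one plausible way for the number of correct items retrieved to increase.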