Title :
Semi-automatic indexing of documents with a multilingual thesaurus
Author :
Schiel, Ulrich ; De Sousa, Lanna M S F
Author_Institution :
Univ. Fed. de Campina Grande, Brazil
Abstract :
With the growing significance of digital libraries and the Internet, more and more electronic texts are becoming accessible to a wide and geographically dispersed public. This requires adequate tools to facilitate the indexing, storage, and retrieval of documents written in different languages. We present a method for the semi-automatic indexing of electronic documents and the construction of a multilingual thesaurus, which can be used for query formulation and information retrieval. We use special dictionaries and user interaction to resolve ambiguities and to find, for each term, an adequate canonical term in the document's language and an adequate abstract, language-independent term. The abstract thesaurus is updated incrementally as new documents are indexed and is used to search for documents using adequate terms.
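The indexing flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary contents, the concept identifiers, the `Thesaurus` class, and the `choose` callback (standing in for user interaction) are all hypothetical names introduced here for illustration.

```python
from collections import defaultdict

# Hypothetical "special dictionary": (language, surface word) -> candidate canonical terms.
DICTIONARY = {
    ("en", "libraries"): ["library"],
    ("pt", "bibliotecas"): ["biblioteca"],
    ("en", "bank"): ["bank (finance)", "bank (river)"],  # ambiguous entry
}

# Hypothetical mapping: (language, canonical term) -> abstract, language-independent term.
CONCEPTS = {
    ("en", "library"): "C_LIBRARY",
    ("pt", "biblioteca"): "C_LIBRARY",
    ("en", "bank (finance)"): "C_BANK_FINANCE",
    ("en", "bank (river)"): "C_RIVERBANK",
}


class Thesaurus:
    """Toy abstract thesaurus: concept id -> ids of documents indexed under it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, doc_id, lang, text, choose):
        """Index one document; `choose(word, candidates)` models the user's decision."""
        for word in text.lower().split():
            candidates = DICTIONARY.get((lang, word), [])
            if not candidates:
                continue  # word unknown to the dictionary: skipped in this sketch
            # Unambiguous words are resolved automatically; ambiguous ones go to the user.
            canonical = candidates[0] if len(candidates) == 1 else choose(word, candidates)
            concept = CONCEPTS.get((lang, canonical))
            if concept is not None:
                self.postings[concept].add(doc_id)  # incremental thesaurus update

    def search(self, lang, term):
        """Return ids of documents indexed under the concept of a canonical term."""
        concept = CONCEPTS.get((lang, term))
        return sorted(self.postings.get(concept, set()))
```

Under these assumptions, a Portuguese query term such as `biblioteca` would retrieve documents indexed from English text, because both canonical terms map to the same abstract concept.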
Keywords :
dictionaries; digital libraries; indexing; information retrieval; natural language interfaces; thesauri; Internet; abstract language-independence; document indexing; document retrieval; document storage; electronic documents; electronic texts; multilingual thesaurus; query formulation; semiautomatic indexing; special dictionaries; user interaction; Asia; Data mining; Machine assisted indexing; Natural languages; Software libraries; Web sites
Conference_Title :
Research Issues in Data Engineering: Multi-lingual Information Management, 2003. RIDE-MLIM 2003. Proceedings. 13th International Workshop on
Print_ISBN :
0-7803-7868-7
DOI :
10.1109/RIDE.2003.1249843