DocumentCode :
2352853
Title :
Semi-automatic indexing of documents with a multilingual thesaurus
Author :
Schiel, Ulrich ; De Sousa, Lanna M S F
Author_Institution :
Univ. Fed. de Campina Grande, Brazil
fYear :
2003
fDate :
10-11 March 2003
Firstpage :
31
Lastpage :
38
Abstract :
With the growing significance of digital libraries and the Internet, more and more electronic texts become accessible to a wide and geographically dispersed public. This requires adequate tools to facilitate indexing, storage, and retrieval of documents written in different languages. We present a method for semi-automatic indexing of electronic documents and construction of a multilingual thesaurus, which can be used for query formulation and information retrieval. We use special dictionaries and user interaction to resolve ambiguities and to find, for each term, an adequate canonical term in the document's language and an adequate abstract, language-independent term. The abstract thesaurus is updated incrementally as new documents are indexed and is used to search for documents using adequate terms.
Keywords :
dictionaries; digital libraries; indexing; information retrieval; natural language interfaces; thesauri; Internet; abstract language-independence; document indexing; document retrieval; document storage; electronic documents; electronic texts; multilingual thesaurus; query formulation; semiautomatic indexing; special dictionaries; user interaction; Asia; Data mining; Machine assisted indexing; Natural languages; Software libraries; Web sites
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Research Issues in Data Engineering: Multi-lingual Information Management, 2003. RIDE-MLIM 2003. Proceedings. 13th International Workshop on
ISSN :
1066-1395
Print_ISBN :
0-7803-7868-7
Type :
conf
DOI :
10.1109/RIDE.2003.1249843
Filename :
1249843