Title :
Multi-Term Keywords for Indexing Multilingual Textual Repositories: Developing Language Resources and Algorithms.
Author :
Panunzi, A. ; Fabbri, Marco ; Moneglia, Massimo ; Zini, Manuel
Author_Institution :
Italian Dept., Università di Firenze
Abstract :
The keyword extraction tool developed within the AXMEDIS project has been designed to work in a multilingual environment, and new algorithms have been developed to generate keywords that are more representative for content search and identification. The paper specifies the linguistic criteria followed in building language resources for French, Italian, and German, which are comparable to those available for English, and describes the algorithms used to enhance keyword performance. The new algorithm extracts keyword collocations from the document and generates multiword keywords. Multi-term descriptors are a good means of identifying content more precisely. Compared with mono-term keywords, precision improves by a relative factor of 100%.
Keywords :
classification; content management; indexing; information retrieval; natural languages; AXMEDIS project; English language; French language; German language; Italian language; content identification; content search; indexing; keyword extraction; multilingual environment; multilingual textual repositories; multiterm keywords; Algorithm design and analysis; Buildings; Computational linguistics; Databases; Frequency; Indexing; MONOS devices; Natural languages; Performance analysis; Statistics;
Conference_Title :
Automated Production of Cross Media Content for Multi-Channel Distribution, 2006. AXMEDIS '06. Second International Conference on
Conference_Location :
Leeds
Print_ISBN :
0-7695-2625-X
DOI :
10.1109/AXMEDIS.2006.36