Title :
Comparing Different Approaches to Treat Translation Ambiguity in CLIR: Structured Queries vs. Target Co-occurrence Based Selection
Author :
Saralegi, Xabier ; Lopez de Lacalle, M.
Author_Institution :
R&D Elhuyar Found., Usurbil, Spain
Date :
Aug. 31 2009-Sept. 4 2009
Abstract :
Two main problems in cross-language information retrieval (CLIR) are translation selection and the treatment of out-of-vocabulary terms. This paper focuses on translation selection. Structured queries and target co-occurrence-based methods seem to be the most appropriate approaches when parallel corpora are not available, but no comparative study of the two exists. We compare the results obtained with each method, identify the weaknesses of each, and propose a hybrid method that combines both. In terms of mean average precision, results for Basque-English cross-lingual retrieval show that structured queries are the best approach with both long and short queries.
Keywords :
language translation; natural language processing; query processing; Basque-English cross-lingual retrieval; cross-language information retrieval; mean average precision; out-of-vocabulary terms; structured queries; target co-occurrence based selection; translation selection; Algorithm design and analysis; Context; Databases; Dictionaries; Expert systems; Information retrieval; Natural languages; Research and development; Statistics; Crosslingual information retrieval; co-occurrence statistics; structured query translation;
Conference_Title :
Database and Expert Systems Application, 2009. DEXA '09. 20th International Workshop on
Conference_Location :
Linz
Print_ISBN :
978-0-7695-3763-4
DOI :
10.1109/DEXA.2009.58