DocumentCode :
1905336
Title :
Semantic Cross-lingual Information Retrieval
Author :
Pourmahmoud, Solmaz ; Shamsfard, Mehrnoush
Author_Institution :
Technol. & Eng. Dept., Islamic Azad Univ. Khoy Branch, Khoy
fYear :
2008
fDate :
27-29 Oct. 2008
Firstpage :
1
Lastpage :
4
Abstract :
Cross-lingual Information Retrieval (CLIR) refers to information retrieval activities in which the query and/or documents may appear in different languages. Dictionary-based query translation has been a common method in CLIR systems. These methods face the problem of translation ambiguity, in which a single word in one language has more than one translation in the other language. In this paper we propose a hybrid approach to retrieve English documents relevant to Persian queries. In this approach we exploit a combination of phrase reorganization, pattern-based phrase translation, and query expansion before and after translation to improve dictionary-based query translation. We also propose an improved probabilistic algorithm to choose the best translation of words and phrases. Finally, the documents are ranked according to a statistical language model with some translation steps. Our experimental results show that each of the mentioned methods can bring significant improvement over simple dictionary approaches.
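The translation-ambiguity problem described in the abstract can be illustrated with a minimal sketch. This is not the authors' probabilistic algorithm, which the record does not specify; it is a generic co-occurrence heuristic under stated assumptions. The dictionary entries (transliterated Persian), the co-occurrence counts, and the names `DICTIONARY`, `COOCCURRENCE`, `cohesion`, and `translate_query` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Toy bilingual dictionary: transliterated Persian -> English candidates.
# Hypothetical entries; a real system would load a full dictionary.
DICTIONARY = {
    "shir": ["milk", "lion", "tap"],  # one source word, several translations
    "jangal": ["forest"],
}

# Made-up co-occurrence counts standing in for statistics from an English corpus.
COOCCURRENCE = {
    frozenset(["lion", "forest"]): 40,
    frozenset(["milk", "forest"]): 2,
    frozenset(["tap", "forest"]): 1,
}


def cohesion(words):
    """Sum pairwise co-occurrence counts, adding one per pair as smoothing."""
    score = 0
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            pair = frozenset([words[i], words[j]])
            score += COOCCURRENCE.get(pair, 0) + 1
    return score


def translate_query(source_terms):
    """Pick the combination of candidate translations with the highest cohesion.

    Terms missing from the dictionary are passed through unchanged.
    """
    candidates = [DICTIONARY.get(term, [term]) for term in source_terms]
    return list(max(product(*candidates), key=cohesion))


print(translate_query(["shir", "jangal"]))  # ['lion', 'forest']
```

Scoring every combination is exponential in query length, so this exhaustive search is only reasonable for short queries; it is shown here because it makes the ambiguity and its resolution easy to see.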
Keywords :
dictionaries; document handling; information retrieval systems; language translation; natural languages; probability; query formulation; CLIR system; English document relevant retrieval; Persian query expansion; dictionary-based query translation; phrase translation; probabilistic algorithm; semantic cross-lingual information retrieval; statistical language model; Dictionaries; Information retrieval; Natural languages; Statistics; Uncertainty; Cross language information retrieval; Language model; Query expansion; Semantic ranking;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Sciences, 2008. ISCIS '08. 23rd International Symposium on
Conference_Location :
Istanbul
Print_ISBN :
978-1-4244-2880-9
Electronic_ISBN :
978-1-4244-2881-6
Type :
conf
DOI :
10.1109/ISCIS.2008.4717868
Filename :
4717868