Conference record number:
3297
Article title:
Adapting Google Translate for English-Persian crosslingual information retrieval in medical domain
Authors:
Rahmani Amin, Computational Linguistics Department, Regional Information Center for Science and Technology, Shiraz, Iran
Keywords:
BLEU, MAP, Google Translate API, Machine Translation System, CLIR
Conference title:
19th International Symposium on Artificial Intelligence and Signal Processing
English abstract:
Cross-lingual information retrieval (CLIR) systems
enable users to search for and retrieve information from
sources written in languages other than their native language.
In general, these systems help users overcome the language
barrier. Although several techniques are used to develop
such systems, the query translation method has attracted much
attention because of its performance. In this paper, the author
proposes a new approach to English-Persian CLIR, in which the
Google Translate API was adapted to translate the queries for the
CLIR system. Using the TREC dataset, 50 queries were selected to
evaluate the system. Both the English queries and their Persian
equivalents were searched in RICeST's English and Persian
E-articles databases. For the black-box evaluation, the researcher
used the 11-point interpolated average precision metric to obtain
the average precision (AP) score for each query, after which the
mean average precision (MAP) scores for the English and Persian
queries were calculated. The MAP scores for the monolingual and
cross-lingual systems were 0.421 and 0.382, respectively. For the
glass-box evaluation, the machine translation system's performance
was measured with the BLEU automatic metric. According to the
results of this study, a 90% similarity in retrieval performance
was observed between the CLIR and the monolingual systems. The new
approach was ideally suited to the English-Persian CLIR task.
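
The black-box evaluation described in the abstract can be illustrated with a minimal sketch of 11-point interpolated average precision and MAP. This is not the author's implementation: the function names `eleven_point_interpolated_ap` and `mean_average_precision` are hypothetical, and the document IDs and relevance judgments in the usage lines are invented purely for illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

# The 11 standard recall levels: 0.0, 0.1, ..., 1.0
RECALL_LEVELS: List[float] = [i / 10 for i in range(11)]


def eleven_point_interpolated_ap(ranked_ids: Sequence[str],
                                 relevant_ids: Set[str]) -> float:
    """11-point interpolated average precision for a single query."""
    if not relevant_ids:
        return 0.0

    # Record (recall, precision) after each position in the ranking.
    hits = 0
    points: List[Tuple[float, float]] = []
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
        points.append((hits / len(relevant_ids), hits / rank))

    # Interpolated precision at a recall level is the highest precision
    # observed at any recall greater than or equal to that level.
    total = 0.0
    for level in RECALL_LEVELS:
        candidates = [p for r, p in points if r >= level - 1e-12]
        total += max(candidates, default=0.0)
    return total / len(RECALL_LEVELS)


def mean_average_precision(
        runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the per-query AP scores over all (ranking, relevant) pairs."""
    scores = [eleven_point_interpolated_ap(ranked, relevant)
              for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative data only (hypothetical document IDs and judgments).
example_ranking = ["d1", "d2", "d3", "d4"]
example_relevant = {"d1", "d3"}
example_ap = eleven_point_interpolated_ap(example_ranking, example_relevant)
```

Applying `mean_average_precision` once to the rankings returned for the English queries and once to those returned for the translated Persian queries would give the two MAP values being compared. With the values reported in the abstract, 0.382 / 0.421 ≈ 0.907, which is consistent with the 90% figure, although the abstract does not state how that figure was derived.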