DocumentCode :
389418
Title :
Generating transliteration rules for cross-language information retrieval from machine translation dictionaries
Author :
Sakai, Tetsuya ; Kumano, Akira ; Manabe, Toshihiko
Author_Institution :
Corp. Res. Dev. Center, Toshiba Corp., Kawasaki, Japan
Volume :
6
fYear :
2002
fDate :
6-9 Oct. 2002
Abstract :
This paper describes a method for automatically converting existing English-Japanese and Japanese-English machine translation dictionaries into English-Japanese transliteration rules and Japanese-English back-transliteration rules for cross-language information retrieval. An existing English-katakana word alignment module, which is part of our own machine translation system, is exploited in generating probabilistic rewriting rules. If our system is allowed to output 15 candidate spellings, it successfully transliterates more than 75% of a set of out-of-vocabulary English words into katakana, and successfully back-transliterates more than 55% of a set of out-of-vocabulary katakana words into English. Moreover, our preliminary cross-language information retrieval experiments, which treat the candidate spellings as a group of synonyms, suggest that our methods can indeed compensate for the failure of machine translation in some cases.
Keywords :
information retrieval; language translation; English-Japanese transliteration rules; Japanese-English back-transliteration rules; cross-language information retrieval; machine translation dictionaries; synonyms; transliteration rules; Counting circuits; Dictionaries; Information retrieval; Laboratories; Natural languages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems, Man and Cybernetics, 2002 IEEE International Conference on
ISSN :
1062-922X
Print_ISBN :
0-7803-7437-1
Type :
conf
DOI :
10.1109/ICSMC.2002.1175601
Filename :
1175601