DocumentCode :
3277799
Title :
A CLIR-oriented OOV translation mining method from bilingual webpages
Author :
Liu, Lan ; Ge, Yun-dong ; Yan, Zhen-xiang ; Yao, Jian-min
Author_Institution :
Inf. Suzhou Key Lab. on Inf. Process., Soochow Univ., Suzhou, China
Volume :
4
fYear :
2011
fDate :
10-13 July 2011
Firstpage :
1872
Lastpage :
1877
Abstract :
Translating unknown (out-of-vocabulary, OOV) terms is a major bottleneck for cross-language information retrieval (CLIR). This paper proposes a solution to three subproblems: detecting relevant webpages, extracting translations with correct boundaries, and ranking candidate translations. Translations of topic words are used to expand the source query and collect bilingual search engine snippets. An improved Frequency Change Measurement method then extracts valid candidates from small, noisy bilingual corpora. Finally, frequency-distance, surface-pattern, and phonetic features are used to select the correct translation. Experimental results show impressive performance on unknown term translation mining.
Keywords :
Internet; data mining; natural language processing; CLIR oriented OOV translation mining method; bilingual search engine snippets; bilingual webpages; frequency change measurement method; translation extraction; translation ranking; webpage detection; word translations; Cybernetics; Data mining; Frequency measurement; Machine learning; Noise; Pattern matching; Search engines; Cross-language IR; Search engine; Web mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics (ICMLC), 2011 International Conference on
Conference_Location :
Guilin
ISSN :
2160-133X
Print_ISBN :
978-1-4577-0305-8
Type :
conf
DOI :
10.1109/ICMLC.2011.6016958
Filename :
6016958