DocumentCode :
2733177
Title :
Mining Translations of Chinese Names from Web Corpora Using a Query Expansion Technique and Support Vector Machine
Author :
Yang, Kai-Hsiang ; Chen, Wei-Da ; Lee, Hahn-Ming ; Ho, Jan-Ming
Author_Institution :
Acad. Sinica, Taipei
fYear :
2007
fDate :
5-12 Nov. 2007
Firstpage :
530
Lastpage :
533
Abstract :
Chinese name translation is a special case of named entity translation. It is a challenging problem because many Romanization systems exist and some people add extra words to their English names. Translating a scholar's Chinese name into the corresponding English name can help find information about the scholar's academic achievements. In this paper, we provide a classification of Chinese names and propose a novel approach to mining Chinese name translations from Web corpora. Our approach uses a query expansion technique and a support vector machine (SVM) to extract name translation candidates based on three kinds of features: phonetic similarity, smallest distance, and number of appearances in the neighborhood. Experimental results show that our approach can correctly translate the majority of Chinese names.
Keywords :
classification; data mining; language translation; natural languages; query formulation; support vector machines; Chinese name classification; Chinese name translation mining; Web corpora; name translation candidate extraction; named entity translation; neighborhood appearances; phonetic similarity; query expansion technique; smallest distance; support vector machine; Chaos; Computer science; Conferences; Information science; Intelligent agent; Internet; Machine intelligence; Search engines; Support vector machines; Web mining; query expansion; SVM; name translation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology Workshops, 2007 IEEE/WIC/ACM International Conferences on
Conference_Location :
Silicon Valley, CA
Print_ISBN :
0-7695-3028-1
Type :
conf
DOI :
10.1109/WI-IATW.2007.64
Filename :
4427644