DocumentCode :
2352712
Title :
Determining Gender of Korean Names with Context
Author :
Yoon, Hee-Geun ; Park, Seong-Bae ; Han, Yong-Jin ; Lee, Sang-Jo
Author_Institution :
Dept. of Comput. Eng., Kyungpook Nat. Univ., Daegu
fYear :
2008
fDate :
23-25 July 2008
Firstpage :
121
Lastpage :
126
Abstract :
Machine translation systems still have various problems despite continuous development. In Korean-English translation in particular, the zero pronoun problem is important, since subjects or objects omitted in Korean must be restored in English. Various methods have been proposed to solve this problem. In this paper, we focus on determining the gender of Korean names as a first step toward solving the zero pronoun problem in Korean. Since this task can be viewed as a binary classification problem, we adopt support vector machines, which are well known for binary classification. The bag-of-words model is used to represent a name with its context as a vector, and the information entropy of words is used to select features. An evaluation shows that the proposed method achieves about 86% accuracy. This is higher than the accuracy of a baseline that assigns each name its majority gender, and the method additionally overcomes the limitation of memory-based and statistical methods that use only names.
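The abstract gives no implementation details, so the following is only a minimal sketch of how entropy-based feature selection over a bag-of-words representation might look. The toy data, the function names (`word_entropy`, `select_features`, `vectorize`), and the `max_entropy` threshold are all hypothetical, and the exact entropy formula and cutoff used by the authors are not stated in this record.

```python
import math
from collections import Counter, defaultdict

# Made-up toy data (not from the paper): context words seen near a name,
# paired with the gender label of that name.
SAMPLES = [
    (["she", "actress", "film"], "F"),
    (["her", "film", "award"], "F"),
    (["he", "actor", "film"], "M"),
    (["his", "award", "army"], "M"),
]

def word_entropy(samples):
    """Entropy (in bits) of the gender distribution for each context word."""
    counts = defaultdict(Counter)
    for words, label in samples:
        for w in set(words):          # count each word once per sample
            counts[w][label] += 1
    entropy = {}
    for w, by_label in counts.items():
        total = sum(by_label.values())
        entropy[w] = -sum(
            (c / total) * math.log2(c / total) for c in by_label.values()
        )
    return entropy

def select_features(samples, max_entropy=0.5):
    """Keep words whose gender distribution is skewed (low entropy).

    The 0.5-bit threshold is an assumption chosen for illustration.
    """
    ent = word_entropy(samples)
    return sorted(w for w, h in ent.items() if h <= max_entropy)

def vectorize(words, vocabulary):
    """Bag-of-words count vector restricted to the selected vocabulary."""
    counts = Counter(words)
    return [counts[w] for w in vocabulary]

VOCAB = select_features(SAMPLES)
```

Words such as "film" or "award", which occur with both genders in the toy data, have high entropy and are dropped, while gender-skewed words are kept. Vectors produced by `vectorize` could then be passed to any binary classifier, such as the support vector machine the abstract mentions.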
Keywords :
language translation; natural language processing; support vector machines; Korean names; Korean-English translation system; bag-of-words model; binary classification problem; gender determination problem; information entropy; machine translation system; support vector machine; zero pronoun problem; Context modeling; Humans; Information entropy; Information technology; Motion pictures; Natural languages; Statistical analysis; Support vector machine classification; Support vector machines; Determine gender; svm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Language Processing and Web Information Technology, 2008. ALPIT '08. International Conference on
Conference_Location :
Dalian, Liaoning
Print_ISBN :
978-0-7695-3273-8
Type :
conf
DOI :
10.1109/ALPIT.2008.86
Filename :
4584353