DocumentCode :
2319452
Title :
Exploring multiple features for sense prediction of Chinese unknown words
Author :
Wang, Chao-yue ; Zhao, Yan-qing ; Fu, Guo-hong
Author_Institution :
Sch. of Comput. Sci. & Technol., Heilongjiang Univ., Harbin, China
Volume :
5
fYear :
2012
fDate :
15-17 July 2012
Firstpage :
2031
Lastpage :
2036
Abstract :
Word sense disambiguation is a crucial problem in natural language processing. While sense disambiguation of in-vocabulary words has been well studied, few research findings are available on predicting the senses of unknown words. In this paper, we attempt to exploit multiple features for predicting the senses of Chinese out-of-vocabulary words in real text. To this end, we first take morphemes as the basic component units of Chinese words and investigate the relationship between the senses of Chinese unknown words and their internal morphological structures. Then, we explore both word-internal cues and word-external contextual features, and combine them for sense prediction of Chinese unknown words using maximum entropy modeling. Our experimental results show that incorporating multiple features, especially the word-internal morphological features, is of great value to Chinese unknown word sense prediction.
Keywords :
natural language processing; text analysis; vocabulary; Chinese out-of-vocabulary words; Chinese unknown words; internal morphological structures; maximum entropy modeling; multiple features; sense prediction; word sense disambiguation; word-internal morphological features; Abstracts; Maximum entropy models; Morpheme features
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics (ICMLC), 2012 International Conference on
Conference_Location :
Xi'an
ISSN :
2160-133X
Print_ISBN :
978-1-4673-1484-8
Type :
conf
DOI :
10.1109/ICMLC.2012.6359688
Filename :
6359688