DocumentCode :
1607294
Title :
Learning topic knowledge to improve Chinese word sense disambiguation
Author :
Wang, Huizhen ; Zhu, Jingbo
Author_Institution :
Natural Language Process. Lab., Northeastern Univ., Shenyang, China
fYear :
2010
Firstpage :
175
Lastpage :
180
Abstract :
This paper addresses the problem of incorporating topic knowledge to improve Chinese word sense disambiguation. The key question is how to learn topic knowledge as features in the design of classifiers for disambiguating word senses. The paper presents two solutions for learning topic knowledge. In the first, a Chinese domain knowledge dictionary named NEUKD is used to generate a domain feature set; because the coverage of NEUKD is limited, a constrained clustering algorithm is adopted to expand the dictionary. The second builds a topic feature set by applying the Latent Dirichlet Allocation (LDA) algorithm to a large-scale unlabeled corpus. Experiments on the SENSEVAL-3 Chinese dataset demonstrate that integrating topic knowledge improves the performance of Chinese word sense disambiguation.
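As a rough illustration of the first solution mentioned in the abstract, the sketch below turns the context of an ambiguous word into domain-count features through a dictionary lookup. The toy dictionary, the domain labels, and the function name `domain_features` are assumptions made for illustration only; they are not taken from NEUKD or from the paper, and the paper's actual feature design may differ.

```python
from collections import Counter

# Hypothetical stand-in for a domain knowledge dictionary: word -> domain label.
# These entries are invented for illustration and are not from NEUKD.
TOY_DOMAIN_DICT = {
    "银行": "finance",
    "贷款": "finance",
    "利率": "finance",
    "河流": "geography",
    "岸边": "geography",
}


def domain_features(context_words, domain_dict=TOY_DOMAIN_DICT):
    """Count how many context words fall in each domain.

    Words missing from the dictionary are skipped, which is the coverage
    gap that dictionary expansion is meant to reduce.
    """
    counts = Counter(
        domain_dict[word] for word in context_words if word in domain_dict
    )
    # Prefix the keys so they can be merged with other feature types.
    return {f"domain={label}": n for label, n in counts.items()}


# Context words around an ambiguous target word (already segmented).
features = domain_features(["贷款", "利率", "上涨", "岸边"])
```

Here "上涨" is not in the toy dictionary and contributes nothing, so the resulting feature map counts two finance words and one geography word. Such a map could be merged with ordinary context features before training a sense classifier.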
Keywords :
dictionaries; learning (artificial intelligence); natural language processing; pattern classification; pattern clustering; Chinese domain knowledge dictionary; Chinese word sense disambiguation; NEUKD; SENSEVAL-3 Chinese dataset; classifier design; constrained clustering algorithm; domain feature set generation; latent Dirichlet allocation algorithm; topic knowledge learning; Classification algorithms; Clustering algorithms; Context; Context modeling; Data models; Dictionaries; Training; Chinese word sense disambiguation; classification model; topic knowledge;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Universal Communication Symposium (IUCS), 2010 4th International
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7821-7
Type :
conf
DOI :
10.1109/IUCS.2010.5666232
Filename :
5666232