DocumentCode :
2114086
Title :
A Chinese unsupervised word sense disambiguation method based on semantic vector
Author :
Lei Cui ; Xinfu Li ; Danqing Wang
Author_Institution :
Coll. of Math. & Comput. Sci., Hebei Univ., Baoding, China
fYear :
2012
fDate :
21-23 April 2012
Firstpage :
3009
Lastpage :
3012
Abstract :
Supervised machine-learning methods for word sense disambiguation require the words of the training corpus to be annotated. To overcome the data sparseness problem and achieve good disambiguation performance, a large-scale annotated corpus must be built, but producing such a corpus carries a high manual cost. To address this problem, this paper proposes an unsupervised learning method that requires no manual annotation. First, feature words are mined using PMI (pointwise mutual information) and a Z test, and v words are defined to describe a given sense of a polysemous word. The similarity between these sense words and the features of the polysemous word in its context is then calculated to determine the correct sense. The method is applied to ten typical polysemous words, and the experimental results show that it is effective.
Keywords :
data mining; natural language processing; programming language semantics; unsupervised learning; word processing; Chinese unsupervised word sense disambiguation; PMI; Z test; data sparseness problem; feature word mining; marked corpus; point wise mutual information; polysemy; semantic vector; similarity calculation; supervise machine learning; training corpus; unsupervised learning method; v word; word annotation; Clustering algorithms; Context; Dictionaries; Educational institutions; Learning systems; Semantics; Vectors; PMI; semantic vector; similarity; unsupervised learning; word sense disambiguation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Consumer Electronics, Communications and Networks (CECNet), 2012 2nd International Conference on
Conference_Location :
Yichang
Print_ISBN :
978-1-4577-1414-6
Type :
conf
DOI :
10.1109/CECNet.2012.6201527
Filename :
6201527