DocumentCode :
3316736
Title :
Word sense disambiguation using multi-engine collaborative bootstrapping
Author :
Duan, Jianyong ; Wu, Weilin ; Hu, Yi ; Chen, Yuquan ; Lu, Ruzhan
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., China
fYear :
2005
fDate :
30 Oct.-1 Nov. 2005
Firstpage :
20
Lastpage :
25
Abstract :
In this paper we propose a new word sense disambiguation method, called multi-engine collaborative bootstrapping (MCB), that combines different types of corpora and uses two languages for bootstrapping. MCB uses bilingual bootstrapping as its kernel algorithm, which leads to incremental knowledge acquisition. An EM model is used to train the parameters of the base learner. The feature translation model is improved by semantic correlation estimation. In addition, we use multiple engines to produce qualified starting seeds from parallel corpora and monolingual corpora. These seeds, which are generated through unsupervised machine learning approaches, can also ensure bootstrapping effectiveness, in contrast with manually selected seeds, despite their different selection mechanisms. Experimental results demonstrate the effectiveness of MCB. Our experiments also examine factors including the feature space and the number of starting seeds, because the EM algorithm is sensitive to starting values. The limitation of resources is also considered.
Keywords :
computational linguistics; knowledge acquisition; natural languages; unsupervised learning; EM model; MCB; bilingual bootstrapping; feature translation model; incremental knowledge acquisition; multiengine collaborative bootstrapping; semantic correlation estimation; unsupervised machine learning; word sense disambiguation; Collaboration; Computer science; Information retrieval; Kernel; Knowledge acquisition; Machine learning; Machine learning algorithms; Natural language processing; Supervised learning; Tagging;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN :
0-7803-9361-9
Type :
conf
DOI :
10.1109/NLPKE.2005.1598700
Filename :
1598700