Title : 
Exploiting external knowledge sources to improve kernel-based Word Sense Disambiguation
         
        
            Author : 
Jin, Peng ; Li, Fuxin ; Zhu, Danqing ; Wu, Yunfang ; Yu, Shiwen
         
        
            Author_Institution : 
Inst. of Comput. Linguistics, Peking Univ., Beijing
         
        
            Abstract : 
This paper proposes a novel approach to improving kernel-based word sense disambiguation (WSD). We first explain why linear kernels are better suited than translation-invariant kernels to WSD and many other natural language processing problems. Building on the linear kernel, we integrate two external knowledge sources. The first is a set of linguistic rules that identify the crucial features. The second is a distributional similarity thesaurus, used to alleviate data sparseness by generalizing crucial features when they do not match the word form exactly. Experiments show that our approach outperforms the state-of-the-art system on the benchmark data from the English lexical sample task of SemEval-2007, and that the improvement is statistically significant.
         
        
            Keywords : 
linguistics; natural language processing; support vector machines; thesauri; English lexical sample task; SemEval-2007; data sparseness; distributional similarity thesaurus; external knowledge sources; kernel-based word sense disambiguation; linear kernels; linguistic rules; natural language processing problems; Automation; Computational linguistics; Entropy; Kernel; Learning systems; Machine learning; Training data; kernel-based method; word sense disambiguation
         
        
        
        
            Conference_Title : 
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
         
        
            Conference_Location : 
Beijing
         
        
            Print_ISBN : 
978-1-4244-4515-8
         
        
            Electronic_ISBN : 
978-1-4244-2780-2
         
        
        
            DOI : 
10.1109/NLPKE.2008.4906810