DocumentCode :
124245
Title :
Word Sense Induction with Multilingual Features Representation
Author :
Albano, Luca ; Beneventano, Domenico ; Bergamaschi, Sonia
Author_Institution :
DIEF, Univ. of Modena & Reggio Emilia, Modena, Italy
Volume :
2
fYear :
2014
fDate :
11-14 Aug. 2014
Firstpage :
343
Lastpage :
349
Abstract :
The use of word senses in place of surface word forms has been shown to improve performance on many computational tasks, including intelligent web search. In this paper we propose a novel approach to the automatic discovery of word senses from raw text, a task referred to as Word Sense Induction (WSI). Almost all WSI approaches proposed in the literature deal with monolingual data, and only very few incorporate bilingual data. The method we propose is innovative in that it uses multilingual data to perform WSI of words in a given language. The experiments show a clear overall improvement in performance: the single-language setting is outperformed by the multi-language settings on almost all the considered target words. The performance gain, in terms of F-measure, averages 5% and in some cases reaches 40%.
Keywords :
natural language processing; pattern clustering; text analysis; word processing; F-measure; WSI; context clustering; multilingual data; multilingual feature representation; word sense induction; Clustering algorithms; Context; Noise; Performance gain; Testing; Training; Vectors; Clustering; Multilingual; Web Search; Word Sense Disambiguation; Word Sense Induction;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Warsaw
Type :
conf
DOI :
10.1109/WI-IAT.2014.117
Filename :
6927644