DocumentCode :
3466682
Title :
Clustering Using Feature Domain Similarity to Discover Word Senses for Adjectives
Author :
Tomuro, Noriko ; Lytinen, Steven L. ; Kanzaki, Kyoko ; Isahara, Hitoshi
Author_Institution :
DePaul Univ., Chicago
fYear :
2007
fDate :
17-19 Sept. 2007
Firstpage :
370
Lastpage :
377
Abstract :
This paper presents a new clustering algorithm called DSCBC, which is designed to automatically discover word senses for polysemous words. DSCBC is an extension of CBC clustering (P. Pantel and D. Lin, 2002) and incorporates feature domain similarity: the similarity between the features themselves, obtained a priori from sources external to the dataset at hand. When polysemous words are clustered, words that have similar sense patterns are often grouped together, producing polysemous clusters: clusters in which features from several different domains are mixed. By incorporating feature domain similarity in clustering, DSCBC produces monosemous clusters, thereby discovering individual senses of polysemous words. In this work, we apply the algorithm to English adjectives and compare the discovered senses against WordNet. The results show significant improvements by our algorithm over other clustering algorithms, including CBC.
Keywords :
natural language processing; pattern clustering; DSCBC; English adjectives; clustering algorithm; feature domain similarity; polysemous word clustering; word senses discovering; Algorithm design and analysis; Clustering algorithms; Communications technology; Computer science; Concrete; Frequency; Information systems; Natural language processing; Telecommunication computing; Thesauri;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Semantic Computing, 2007. ICSC 2007. International Conference on
Conference_Location :
Irvine, CA
Print_ISBN :
978-0-7695-2997-4
Type :
conf
DOI :
10.1109/ICSC.2007.72
Filename :
4338371