Regional Information Center for Science and Technology (RICEST) - Clustering Using Feature Domain Similarity to Discover Word Senses for Adjectives

DocumentCode :

3466682

Title :

Clustering Using Feature Domain Similarity to Discover Word Senses for Adjectives

Author :

Tomuro, Noriko ; Lytinen, Steven L. ; Kanzaki, Kyoko ; Isahara, Hitoshi

Author_Institution :

DePaul Univ., Chicago

fYear :

2007

fDate :

17-19 Sept. 2007

Firstpage :

370

Lastpage :

377

Abstract :

This paper presents a new clustering algorithm, DSCBC, designed to automatically discover word senses for polysemous words. DSCBC is an extension of CBC clustering (P. Pantel and D. Lin, 2002) that incorporates feature domain similarity: the similarity between the features themselves, obtained a priori from sources external to the dataset at hand. When polysemous words are clustered, words that have similar sense patterns are often grouped together, producing polysemous clusters, that is, clusters in which features from several different domains are mixed. By incorporating feature domain similarity in clustering, DSCBC produces monosemous clusters, thereby discovering the individual senses of polysemous words. In this work, we apply the algorithm to English adjectives and compare the discovered senses against WordNet. The results show significant improvements by our algorithm over other clustering algorithms, including CBC.
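
The idea of feature domain similarity can be illustrated with a small sketch. This is not the DSCBC algorithm, whose details the abstract does not give; it is a simplified illustration under stated assumptions. The function names (feature_domains, split_by_domain), the similarity threshold, and all data values are hypothetical. Features whose a priori similarity meets the threshold are grouped into one domain, and a polysemous word's feature weights are then split by domain so that each part covers a single domain.

```python
from collections import defaultdict


def feature_domains(features, sim, threshold=0.5):
    """Group features into domains (hypothetical helper).

    Two features share a domain when they are linked, directly or through
    other features, by an a priori similarity of at least `threshold`.
    """
    parent = {f: f for f in features}

    def find(f):
        # Follow parent links to the representative, compressing the path.
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    for i, a in enumerate(features):
        for b in features[i + 1:]:
            # Pairs missing from `sim` are treated as dissimilar (0.0).
            if sim.get(frozenset((a, b)), 0.0) >= threshold:
                parent[find(a)] = find(b)

    domains = defaultdict(set)
    for f in features:
        domains[find(f)].add(f)
    return list(domains.values())


def split_by_domain(word_features, domains):
    """Split one word's feature weights into one sub-vector per domain."""
    parts = []
    for domain in domains:
        part = {f: w for f, w in word_features.items() if f in domain}
        if part:
            parts.append(part)
    return parts


# Hypothetical data: nouns modified by the adjective "hot", with made-up
# a priori similarities standing in for an external source.
features = ["soup", "coffee", "topic", "issue"]
sim = {
    frozenset(("soup", "coffee")): 0.8,
    frozenset(("topic", "issue")): 0.9,
    frozenset(("coffee", "topic")): 0.1,
}
domains = feature_domains(features, sim)

# Made-up co-occurrence weights for "hot".
hot = {"soup": 3.0, "coffee": 2.0, "topic": 4.0, "issue": 1.0}
senses = split_by_domain(hot, domains)
```

With this toy data the features fall into two domains (food and discourse), so the single mixed vector for "hot" is split into two single-domain parts, which mirrors the contrast between polysemous and monosemous clusters described in the abstract.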

Keywords :

natural language processing; pattern clustering; DSCBC; English adjectives; clustering algorithm; feature domain similarity; polysemous word clustering; word senses discovering; Algorithm design and analysis; Clustering algorithms; Communications technology; Computer science; Concrete; Frequency; Information systems; Natural language processing; Telecommunication computing; Thesauri

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Semantic Computing, 2007. ICSC 2007. International Conference on

Conference_Location :

Irvine, CA

Print_ISBN :

978-0-7695-2997-4

Type :

conf

DOI :

10.1109/ICSC.2007.72

Filename :

4338371

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3466682