Title :
Consensus Similarity Measure for Short Text Clustering
Author :
Youhyun Shin;Yeonchan Ahn;Heesik Jeon;Sang-goo Lee
Author_Institution :
Sch. of Comput. Sci. &
Abstract :
Measuring semantic similarity between short texts is challenging because, given their limited length, changing even a few words can alter the meaning of a short text dramatically. In this paper, we propose a novel similarity measure for terms that yields better clustering performance than a state-of-the-art method. To achieve this, we combine knowledge-based and corpus-based term similarity measures in order to exploit the advantages of both approaches. We apply our method to a dialog-utterance dataset consisting of short dialog texts. An empirical study shows that the proposed method outperforms one of the state-of-the-art clustering algorithms for short text clustering.
Keywords :
"Semantics","Batteries","Knowledge based systems","Context","Natural language processing","Length measurement","Taxonomy"
Conference_Title :
Database and Expert Systems Applications (DEXA), 2015 26th International Workshop on
Print_ISBN :
978-1-4673-7581-8
Electronic_ISBN :
2378-3915
DOI :
10.1109/DEXA.2015.65