Title :
Document similarity measure for topic detection in BBS
Author :
Zhang Zhonghui ; Wu Bin
Author_Institution :
Sch. of Comput. Sci., Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
Methods for calculating document similarity are closely tied to specific applications. Topic detection in BBS based on document similarity must solve two problems: first, highlighting words that are rich in topic information; second, overcoming the adverse impact of the large variance in text length among BBS texts. This paper proposes a novel approach to address these issues: first, the features are divided into five categories: persons (including organizations), locations, nouns, verbs, and others; second, features are selected within each category separately. Experiments show that the approach yields a significant improvement over the traditional approach.
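The two steps named in the abstract (categorizing features, then selecting features per category) can be illustrated with a minimal sketch. The abstract does not state the selection criterion, the similarity function, or how the categories are combined, so everything below is an assumption for illustration only: top-k selection by term frequency, cosine similarity per category, and the hypothetical `WEIGHTS` table. Input documents are assumed to be already tagged into the five categories.

```python
import math
from collections import Counter

# The five feature categories listed in the abstract.
CATEGORIES = ("person", "location", "noun", "verb", "other")

# Hypothetical category weights (not from the paper), chosen only to show
# how topic-rich categories could be emphasized.
WEIGHTS = {"person": 0.3, "location": 0.2, "noun": 0.3, "verb": 0.1, "other": 0.1}


def select_features(tokens, k):
    """Keep the k most frequent tokens of one category (assumed criterion)."""
    return dict(Counter(tokens).most_common(k))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity(doc_a, doc_b, k=3):
    """Weighted average of per-category cosine similarities.

    Each document is a dict mapping a category name to its list of tokens.
    Categories that are empty in both documents are skipped.
    """
    total = 0.0
    weight_sum = 0.0
    for cat in CATEGORIES:
        fa = select_features(doc_a.get(cat, []), k)
        fb = select_features(doc_b.get(cat, []), k)
        if not fa and not fb:
            continue
        total += WEIGHTS[cat] * cosine(fa, fb)
        weight_sum += WEIGHTS[cat]
    return total / weight_sum if weight_sum else 0.0
```

In this sketch, capping each category at k features is one possible way to keep a long post from dominating a short one, since both contribute at most k terms per category; whether the paper handles length variance this way is not stated in the abstract.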
Keywords :
information resources; BBS; Google News alert; document similarity measure; topic detection; Computational modeling; Event detection; Feature extraction; Organizations; Presses; Semantics; Training; document similarity; feature selection
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
DOI :
10.1109/FSKD.2010.5569864