DocumentCode
2043038
Title
Document similarity measure for topic detection in BBS
Author
Zhang Zhonghui ; Wu Bin
Author_Institution
Sch. of Comput. Sci., Beijing Univ. of Posts & Telecommun., Beijing, China
Volume
5
fYear
2010
fDate
10-12 Aug. 2010
Firstpage
2354
Lastpage
2357
Abstract
Document similarity calculation methods are closely tied to specific applications. Document-similarity-based topic detection in BBS must solve two problems: first, highlighting words that are rich in topic information; second, overcoming the adverse impact of the huge variance in text length among BBS texts. This paper proposes a novel approach to address these issues: first, the features are divided into five categories: persons (including organizations), locations, nouns, verbs, and others; second, features are selected within each category separately. Experiments show that the approach yields a significant improvement over the traditional method.
Keywords
information resources; BBS; Google News alert; document similarity measure; topic detection; Computational modeling; Event detection; Feature extraction; Organizations; Presses; Semantics; Training; document similarity; feature selection
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location
Yantai, Shandong
Print_ISBN
978-1-4244-5931-5
Type
conf
DOI
10.1109/FSKD.2010.5569864
Filename
5569864