• DocumentCode
    2043038
  • Title

    Document similarity measure for topic detection in BBS

  • Author

    Zhang Zhonghui ; Wu Bin

  • Author_Institution
    Sch. of Comput. Sci., Beijing Univ. of Posts & Telecommun., Beijing, China
  • Volume
    5
  • fYear
    2010
  • fDate
    10-12 Aug. 2010
  • Firstpage
    2354
  • Lastpage
    2357
  • Abstract
    Document similarity calculation methods are closely tied to specific applications. Document-similarity-based topic detection in BBS must solve two problems: first, highlighting words that are rich in topic information; second, overcoming the adverse impact of the huge variance in text length among BBS texts. This paper proposes a novel approach to address these issues: first, the features are divided into five categories: persons (including organizations), locations, nouns, verbs, and others; second, features are selected within each category separately. Experiments show that the approach yields a significant improvement over the traditional method.
  • Keywords
    information resources; BBS; Google News alert; document similarity measure; topic detection; Computational modeling; Event detection; Feature extraction; Organizations; Presses; Semantics; Training; document similarity; feature selection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-1-4244-5931-5
  • Type

    conf

  • DOI
    10.1109/FSKD.2010.5569864
  • Filename
    5569864