• DocumentCode
    547360
  • Title
    Research on algorithm of Chinese BBS topic detection based on content analysis
  • Author
    Nie Zhe
  • Author_Institution
    Sch. of Electron. & Inf. Eng., Shenzhen Polytech., Shenzhen, China
  • Volume
    3
  • fYear
    2011
  • fDate
    10-12 June 2011
  • Firstpage
    512
  • Lastpage
    516
  • Abstract
    By analyzing the BBS topic model, topic similarity, topic inspection, topic evaluation standards, and topic development trends, this paper designs and implements a Chinese BBS topic detection algorithm based on content analysis. The algorithm obtains BBS information with a web crawler, processes it using URL and XPath page templates, segments the text into words with ICTLAS, clusters BBS topics with Carrot2, analyzes hot topics based on the power spectrum, and predicts BBS topics based on time series. Finally, the paper develops a Chinese BBS topic detection system using the J2EE development kit in the Eclipse integrated development environment, combined with Hibernate and GWT technology; the system achieved good results when tested on various BBS forums.
  • Keywords
    Internet; Java; information retrieval; pattern clustering; time series; BBS topic clustering; Carrot2; Chinese BBS topic detection algorithm; GWT technology; Google Web Toolkits; Hibernate technology; ICTLAS; J2EE development kit; URL; Web crawler; Xpath page template; content analysis; eclipse integrated development environment; time series prediction; Data mining; Data models; Databases; Predictive models; Time series analysis; Web pages; BBS topic detection; algorithm; hot spot; topic clustering analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Automation Engineering (CSAE), 2011 IEEE International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-8727-1
  • Type
    conf
  • DOI
    10.1109/CSAE.2011.5952730
  • Filename
    5952730