DocumentCode
547360
Title
Research on algorithm of Chinese BBS topic detection based on content analysis
Author
Nie Zhe
Author_Institution
Sch. of Electron. & Inf. Eng., Shenzhen Polytech., Shenzhen, China
Volume
3
fYear
2011
fDate
10-12 June 2011
Firstpage
512
Lastpage
516
Abstract
By analyzing the BBS topic model, topic similarity, topic inspection, topic evaluation standards, and topic development trends, this paper designs and implements a Chinese BBS topic detection algorithm based on content analysis. The algorithm obtains BBS information with a web crawler, processes that information using URL and XPath page templates, performs word segmentation with ICTLAS, clusters BBS topics with Carrot2, analyzes hot topics based on the power spectrum, and predicts BBS topics based on time series. Finally, a Chinese BBS topic detection system was developed with the J2EE development kit in the Eclipse integrated development environment, combined with Hibernate and GWT technology; it gave good results when tested on various BBS forums.
Keywords
Internet; Java; information retrieval; pattern clustering; time series; BBS topic clustering; Carrot2; Chinese BBS topic detection algorithm; GWT technology; Google Web Toolkits; Hibernate technology; ICTLAS; J2EE development kit; URL; Web crawler; Xpath page template; content analysis; eclipse integrated development environment; time series prediction; Data mining; Data models; Databases; Java; Predictive models; Time series analysis; Web pages; BBS topic detection; Web crawler; algorithm; hot spot; topic clustering analysis;
fLanguage
English
Publisher
ieee
Conference_Titel
Computer Science and Automation Engineering (CSAE), 2011 IEEE International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-4244-8727-1
Type
conf
DOI
10.1109/CSAE.2011.5952730
Filename
5952730