DocumentCode :
480701
Title :
A Wavelet-Based Model to Recognize High-Quality Topics on Web Forum
Author :
Chen, You ; Cheng, Xue-Qi ; Huang, Yu-Lan
Author_Institution :
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing
Volume :
1
fYear :
2008
fDate :
9-12 Dec. 2008
Firstpage :
343
Lastpage :
351
Abstract :
Web forums have become an important resource on the Web because of the rich information contributed by millions of Internet users every day. At the same time, they contain thousands of junk or valueless messages. Recognizing high-quality topics should be a fundamental task in search engines and Web mining systems. However, quantifying topic quality on a Web forum is not trivial, and users face a daunting challenge in identifying the small subset of topics worthy of their attention. In this paper, we present several characteristics for measuring high-quality topics and, based on these characteristics, propose a novel model to recognize high-quality topics on Web forums. Our model consists of three steps. First, time series signals that capture characteristics distinguishing high-quality topics from non-high-quality topics are extracted from the topics. Second, features are obtained from the signals using the wavelet packet transform (WPT). Third, high-quality topics are recognized from these features using a backpropagation neural network. In experiments on Tencent Message Boards, which contain 2,710,994 messages from 189,962 authors posted between Jan 1, 2005 and Nov 12, 2007, we demonstrate the efficiency of our model: the average accuracy of high-quality topic recognition is 95%, and nearly 50,000 topics can be recognized in one second.
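As a rough illustration of the second step (feature extraction with a wavelet packet transform), the sketch below decomposes a short time series with a Haar wavelet packet tree and returns the relative energy of each leaf sub-band. The abstract does not state which wavelet, decomposition depth, or feature definition the authors used, so the Haar filter, the two-level depth, the energy features, the function names, and the sample signal are all assumptions made for illustration only. Leaves are returned in natural (tree) order, not frequency order.

```python
import math


def haar_step(signal):
    """Split an even-length signal into Haar approximation and detail halves."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet_leaves(signal, levels):
    """Build a full wavelet packet tree and return its leaf nodes.

    Unlike the plain wavelet transform, every node is split at each level,
    not only the approximation branch.
    """
    if levels < 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def energy_features(signal, levels=2):
    """Relative energy per leaf sub-band (an assumed feature definition)."""
    leaves = wavelet_packet_leaves(signal, levels)
    energies = [sum(c * c for c in leaf) for leaf in leaves]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(energies)
    return [e / total for e in energies]


# Hypothetical signal: replies per day for one topic over eight days.
replies_per_day = [12, 15, 9, 14, 3, 2, 1, 0]
features = energy_features(replies_per_day, levels=2)
```

A feature vector of this kind (four values for a two-level tree) is the sort of fixed-length input that could then be passed to a neural network classifier in the third step.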
Keywords :
Internet; backpropagation; data mining; neural nets; search engines; social sciences computing; Internet users; Tencent Message Boards; Web forum; Web mining systems; backpropagation neural network; search engine; wavelet-based model; Character recognition; Computers; Data mining; Discussion forums; Intelligent agent; Neural networks; Search engines; Wavelet packets; Wavelet transforms; Web mining; Feature Extraction; High-Quality topic; Wavelet Packet Transform; Web Forum;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
Type :
conf
DOI :
10.1109/WIIAT.2008.17
Filename :
4740470