DocumentCode
3228600
Title
Automatic Identification of Chinese Weblogger's Interests Based on Text Classification
Author
Ni, Xiaochuan ; Wu, Xiaoyuan ; Yu, Yong
Author_Institution
Shanghai Jiao Tong Univ.
fYear
2006
fDate
18-22 Dec. 2006
Firstpage
247
Lastpage
253
Abstract
Chinese Weblogs have expanded at an incredible speed in recent years, and they contain plentiful personal information. In this paper, we propose a text-classification-based approach to automatically identify the interests of a Weblogger. To solve the problems that arise in classifying Weblog documents, we combine heterogeneous classifiers. We also use a hierarchical classification technique to identify more specific interests. Experiments show that our interest identification approach achieves high accuracy and that, for most Webloggers in our experiments, the interests implied in the contents of their blogs are well identified by this approach.
Keywords
Web sites; text analysis; Chinese Weblog document; hierarchical text classification technique; Blogs; Filtering; Support vector machine classification; Support vector machines; Text categorization; Voting;
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on
Conference_Location
Hong Kong
Print_ISBN
0-7695-2747-7
Type
conf
DOI
10.1109/WI.2006.47
Filename
4061373