DocumentCode
545347
Title
Machine learning based blog classification personal vs. official facet
Author
Sun, Xueji ; Li, Si ; Xu, Weiran ; Chen, Guang ; Guo, Jun
Author_Institution
Sch. of Inf. & Commun. Eng., Beijing Univ. of Posts & Telecommun., Beijing, China
Volume
1
fYear
2011
fDate
11-13 March 2011
Firstpage
31
Lastpage
34
Abstract
Blog services provide a wealth of information resources, which makes blog search and classification valuable research topics. This paper focuses on classifying blogs along the personal vs. official facet. Our system adopts a two-stage strategy: in the training model, lexicons are built automatically; in the classification model, scoring and ranking are carried out in sequence. Our experimental results show that feature selection and Mutual Information weighting benefit the lexicons and yield significant improvements, whereas sentiment words improve the results only slightly.
Keywords
Web services; Web sites; feature extraction; learning (artificial intelligence); pattern classification; blog classification; blog search; blog service; classification model; feature selection; information resource; machine learning; mutual information weighting; sentiment word; training model; Blogs; Buildings; Machine learning; Measurement; Mutual information; Testing; Training; Blog Classification; Feature selection; Lexicons; Machine Learning; Sentiment;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Research and Development (ICCRD), 2011 3rd International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-61284-839-6
Type
conf
DOI
10.1109/ICCRD.2011.5763967
Filename
5763967