DocumentCode :
3724126
Title :
Supervised Topic Models for Microblog Classification
Author :
Saurabh Kataria;Arvind Agarwal
Author_Institution :
Palo Alto Res. Center, Webster, NY, USA
fYear :
2015
Firstpage :
793
Lastpage :
798
Abstract :
In this paper, we present a topic-model-based approach for classifying micro-blog posts into a given set of topics of interest. The short length of micro-blog posts makes it challenging to learn a classification model from them directly. To overcome this limitation, we use the content of the links embedded in these posts to improve topic learning. The hypothesis is that, since the link content is far richer than the content of the post itself, using link content along with the content of the post will help learning. However, how this link content can be used to construct features for classification remains a challenging issue. Furthermore, in previous methods, user-based information is utilized in an ad-hoc manner that only works for certain types of classification, such as characterizing the content of microblogs. In this paper, we propose a supervised topic model, User-Labeled-LDA, and its nonparametric variant, which avoid the ad-hoc feature construction task and model the topics in a discriminative way. Our experiments on a Twitter dataset show that modeling user interests and link information helps in learning quality topics for sparse tweets and also helps significantly in the classification task. Our experiments further show that modeling this information in a principled way through topic models helps more than simply adding this information through features.
Keywords :
"Data models","Encyclopedias","Electronic publishing","Internet","Data mining","Blogs"
Publisher :
IEEE
Conference_Titel :
Data Mining (ICDM), 2015 IEEE International Conference on
ISSN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2015.148
Filename :
7373391