DocumentCode
480695
Title
Linked Topic and Interest Model for Web Forums
Author
Cheng, Victor ; Li, C.H.
Author_Institution
Dept. of Comput. Sci., Hong Kong Baptist Univ., Kowloon
Volume
1
fYear
2008
fDate
9-12 Dec. 2008
Firstpage
279
Lastpage
284
Abstract
In Web forum analysis, both discussion topics and author interests are of central concern. We introduce a linked topic and interest model based on latent Dirichlet allocation (LDA) to explore discussion topics and author interests. Rather than using two separate models, or modeling combined topics and interests with a single hidden topic assignment variable, the proposed model has separate but linked hidden variables for topic and interest exploration. As exact inference of the model parameters is intractable, Gibbs sampling is employed to estimate the topic, author, and interest distributions. The joint distribution of the linked hidden variables also provides an interpretation of an interest in terms of weighted topics, or vice versa. We apply the model to a NIPS data set and to a corpus containing the text of a popular digital camera Web forum. Topics and interests discovered by the model are demonstrated. The generalization capability of the model is also assessed by means of perplexity, and the results show that the linked topic and interest model outperforms both the LDA document topic model and the author topic model.
Keywords
Internet; sampling methods; text analysis; Gibbs sampling; Web forum analysis; author interests model; discussion topic; latent Dirichlet allocation; model generalization capability; Computer science; Content based retrieval; Digital cameras; Indexing; Information analysis; Information retrieval; Intelligent agent; Large scale integration; Linear discriminant analysis; Sampling methods; LDA; topic model; user interest; web forums
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location
Sydney, NSW
Print_ISBN
978-0-7695-3496-1
Type
conf
DOI
10.1109/WIIAT.2008.227
Filename
4740461