Title :
Linked Topic and Interest Model for Web Forums
Author :
Cheng, Victor ; Li, C.H.
Author_Institution :
Dept. of Comput. Sci., Hong Kong Baptist Univ., Kowloon
Abstract :
In Web forum analysis, both discussion topics and author interests are of major concern. We introduce a linked topic and interest model based on latent Dirichlet allocation (LDA) to explore discussion topics and author interests. Rather than using two separate models, or modeling combined topics and interests with a single hidden topic assignment variable, the proposed model has separate but linked hidden variables for topic and interest exploration. As exact inference of the model parameters is intractable, Gibbs sampling is employed to estimate the topic, author, and interest distributions. The joint distribution of the linked hidden variables also provides an interpretation of an interest in terms of weighted topics, or vice versa. We apply the model to a NIPS data set and to a corpus containing the text content of a popular digital camera Web forum. Topics and interests discovered by the model are demonstrated. The generalization capability of the model is also assessed by means of perplexity, and the results show that the linked topic and interest model outperforms both the LDA document topic model and the author topic model.
Keywords :
Internet; sampling methods; text analysis; Gibbs sampling; Web forum analysis; author interests model; discussion topic; latent Dirichlet allocation; model generalization capability; Computer science; Content based retrieval; Digital cameras; Indexing; Information analysis; Information retrieval; Intelligent agent; Large scale integration; Linear discriminant analysis; Sampling methods; LDA; topic model; user interest; web forums;
Conference_Title :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
DOI :
10.1109/WIIAT.2008.227