DocumentCode :
1811225
Title :
User interest modeling by labeled LDA with topic features
Author :
Li, Wenfeng ; Wang, Xiaojie ; Hu, Rile ; Tian, Jilei
Author_Institution :
Center of Intell. Sci. & Technol., Beijing Univ. of Posts & Telecommun., Beijing, China
fYear :
2011
fDate :
15-17 Sept. 2011
Firstpage :
6
Lastpage :
11
Abstract :
It is well known that a user's interests are reflected in his/her web browsing history and can be mined from it. This paper presents an innovative method to extract a user's interests from his/her web browsing history. We first apply an efficient algorithm to extract useful text from the web pages in the user's browsed URL sequence. We then propose a Labeled Latent Dirichlet Allocation with Topic Feature (LLDA-TF) model to mine the user's interests from the text. Unlike other work that needs a large amount of training data to train a model that incorporates supervised information, we introduce the raw supervised information directly into the LLDA-TF procedure. The experimental results show that the output of LLDA-TF fits predefined categories well. Furthermore, the LLDA-TF model can name the user interests by category words and provide a keyword list for each category.
Keywords :
Internet; human computer interaction; information retrieval; text analysis; unsupervised learning; user modelling; LLDA-TF model; Web browsing history; Web pages; innovative method; labeled LDA; latent Dirichlet allocation; text extraction; topic features; unsupervised model; user interest extraction; user interest modeling; Data models; Encyclopedias; Feature extraction; Internet; Training; Web pages; Labeled Latent Dirichlet Allocation with topic Feature (LLDA-TF); browsing history; topic feature; topic model; user interest model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-61284-203-5
Type :
conf
DOI :
10.1109/CCIS.2011.6045022
Filename :
6045022