DocumentCode :
688449
Title :
User Interest Profile Identification Using Wikipedia Knowledge Database
Author :
Huakang Li ; Longbin Lai ; Xiaofeng Xu ; Yao Shen ; Xiangyang Xu ; Chunrong Xia
Author_Institution :
Jiangsu High Technol. Res. Key Lab. for Wireless Sensor Networks, Nanjing Univ. of Posts & Telecommun., Nanjing, China
fYear :
2013
fDate :
13-15 Nov. 2013
Firstpage :
2362
Lastpage :
2367
Abstract :
Interesting, targeted, and relevant advertising is considered one of the most honest sources of revenue in personalized recommendation. Topic identification is the most important technique for handling unstructured Web pages. Conventional content classification approaches based on bag of words have difficulty processing massive numbers of Web pages. In this paper, Wikipedia Category Network (WCN) nodes are used to identify the topic of a Web page and to estimate a user's interest profile. Wikipedia is the largest content knowledge database and is updated dynamically. A basic interest data set is marked on the WCN. The topic characterization of each WCN node is generated from the depth and breadth of the interest data set. To reduce the deviation of the breadth, a family generation algorithm is proposed to estimate the generation weight in the WCN. Finally, an interest decay model based on URL number is proposed to represent a user's interest profile over a time period. Experimental results show that the performance of Web page topic identification using the WCN with the family model is significant, and that the profile identification model behaves dynamically for active users.
Keywords :
Web sites; advertising; deductive databases; recommender systems; URL number; WCN nodes; Web page topic identification; Wikipedia category network nodes; Wikipedia knowledge database; active users; advertisement; breadth deviation reduction; content classification approaches; dynamic update; dynamical performance; family generation algorithm; family model; generation weight estimation; interest data set breadth; interest data set depth; interest decay model; massive Web page processing; profile identification model; recommendation personalization; time period; topic characterization; unstructured Web pages; user interest profile estimation; user interest profile identification; Electronic publishing; Encyclopedias; Games; Internet; Vectors; Web pages; URL decay model; Web page Classification; Wikipedia knowledge network; family similarity; user profile;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on
Conference_Location :
Zhangjiajie
Type :
conf
DOI :
10.1109/HPCC.and.EUC.2013.340
Filename :
6832223