DocumentCode
1811261
Title
Annotation-aware web clustering based on topic model and random walks
Author
Sun, Jiashen ; Wang, Xiaojie ; Yuan, Caixia ; Fang, Guannan
Author_Institution
Dept. of Comput. Sci., Beijing Univ. of Posts & Telecommun., Beijing, China
fYear
2011
fDate
15-17 Sept. 2011
Firstpage
12
Lastpage
16
Abstract
Clustering web pages by semantics or topic promises to improve search and browsing on the web. Intuitively, tags from social bookmarking websites such as del.icio.us can serve as a complementary source to the document text, thus improving the clustering of web pages. In this paper, we present a novel model that employs a topic model to associate each annotated document with a distribution over topics, and then constructs a graph including tags, documents, and topics, on which it performs random walks for clustering. We examine the performance of our model on a real-world data set, showing that it achieves better clustering performance than algorithms that use page text alone.
Keywords
Internet; Web sites; document handling; pattern clustering; Web pages; annotation aware Web clustering; document source; random walks; social bookmarking Websites; topic model; Clustering algorithms; Data models; Measurement; Probability; Web pages; Web search; random walks; social tagging; topic model; web clustering;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-61284-203-5
Type
conf
DOI
10.1109/CCIS.2011.6045023
Filename
6045023