• DocumentCode
    1811261
  • Title
    Annotation-aware web clustering based on topic model and random walks
  • Author
    Sun, Jiashen ; Wang, Xiaojie ; Yuan, Caixia ; Fang, Guannan
  • Author_Institution
    Dept. of Comput. Sci., Beijing Univ. of Posts & Telecommun., Beijing, China
  • fYear
    2011
  • fDate
    15-17 Sept. 2011
  • Firstpage
    12
  • Lastpage
    16
  • Abstract
    Web page clustering based on semantics or topics promises improved search and browsing on the web. Intuitively, tags from social bookmarking websites such as del.icio.us can serve as a complementary source to the documents themselves, thus improving the clustering of web pages. In this paper, we present a novel model that employs a topic model to associate each annotated document with a distribution over topics, and then constructs a graph of tags, documents, and topics on which random walks are performed for clustering. We examine the performance of our model on a real-world data set, showing that it achieves better clustering performance than algorithms that use page text alone.
  • Keywords
    Internet; Web sites; document handling; pattern clustering; Web pages; annotation aware Web clustering; document source; random walks; social bookmarking Websites; topic model; Clustering algorithms; Data models; Measurement; Probability; Web search; social tagging; web clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-61284-203-5
  • Type
    conf
  • DOI
    10.1109/CCIS.2011.6045023
  • Filename
    6045023