• DocumentCode
    3495803
  • Title
    A Hybrid Information Filtering Algorithm Based on Distributed Web Log Mining
  • Author
    Yun, Ling ; Xun, Wang ; Huamao, Gu
  • Author_Institution
    Coll. of Comput. & Inf. Eng., Zhejiang Gongshang Univ.
  • Volume
    1
  • fYear
    2008
  • fDate
    11-13 Nov. 2008
  • Firstpage
    1086
  • Lastpage
    1091
  • Abstract
    For large, distributed commercial mirror sites, this paper presents a hybrid information filtering algorithm based on distributed web log mining. Using multi-agent technology, the algorithm preprocesses the web logs of the mirror sites, replacing manual web page ratings with user browsing preferences, and then constructs and standardizes a user access matrix. On this basis, the paper proposes using web page similarities to predict ratings for pages that have not been rated, thus increasing the number of pages jointly rated by users. This method could effectively address the sparsity of user ratings in collaborative filtering. Finally, a hybrid filtering model is proposed to overcome the drawbacks of the content-based and collaborative filtering models. The experimental results show that the algorithm is applicable to a distributed web server clustering architecture, avoids the inaccuracy and complexity of manual web page ratings, effectively addresses the shortcomings of traditional filtering models, and greatly improves recommendation quality.
  • Keywords
    Web sites; data mining; information filtering; collaborative filtering; content-based filtering; distributed Web log mining; distributed web server clustering; hybrid information filtering algorithm; manual rating; multiagent technology; recommendation quality; user browsing preference; Clustering algorithms; Collaboration; Filtering algorithms; Information filters; Manuals; Mirrors; Service oriented architecture; Web pages; Web server
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Convergence and Hybrid Information Technology, 2008. ICCIT '08. Third International Conference on
  • Conference_Location
    Busan
  • Print_ISBN
    978-0-7695-3407-7
  • Type
    conf
  • DOI
    10.1109/ICCIT.2008.39
  • Filename
    4682178