DocumentCode
3495803
Title
A Hybrid Information Filtering Algorithm Based on Distributed Web Log Mining
Author
Yun, Ling ; Xun, Wang ; Huamao, Gu
Author_Institution
Coll. of Comput. & Inf. Eng., Zhejiang Gongshang Univ.
Volume
1
fYear
2008
fDate
11-13 Nov. 2008
Firstpage
1086
Lastpage
1091
Abstract
For large, distributed commercial mirror sites, this paper presents a hybrid information filtering algorithm based on distributed web log mining. Using multi-agent technology, the algorithm preprocesses the web logs of the mirror sites, replacing manual web page ratings with user browsing preferences; a user access matrix is then constructed and standardized. On this basis, the paper proposes using web page similarities to predict ratings for pages that a user has not rated, thereby increasing the number of pages jointly rated by different users. This method can effectively alleviate the sparsity of user ratings in collaborative filtering. Finally, a hybrid filtering model is proposed to overcome the drawbacks of the content-based and collaborative filtering models. The experimental results show that the algorithm is applicable to a distributed web server clustering architecture, avoids the inaccuracy and complexity of manual web page ratings, effectively remedies the shortcomings of traditional filtering models, and greatly improves recommendation quality.
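The rating-prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how page similarity is computed or how predictions are weighted, so the cosine measure over hypothetical page feature vectors, the similarity-weighted average, and the names `cosine`, `page_similarity` and `fill_missing` are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def page_similarity(page_features):
    """Pairwise page similarity matrix from assumed per-page feature vectors."""
    n = len(page_features)
    return [[cosine(page_features[i], page_features[j]) for j in range(n)]
            for i in range(n)]

def fill_missing(access, sim):
    """Predict unrated entries (None) of a user-by-page access matrix.

    Each missing entry is a similarity-weighted average of the same user's
    observed preferences (an assumed weighting). Observed entries are kept.
    """
    filled = []
    for row in access:
        new_row = []
        for j, value in enumerate(row):
            if value is not None:
                new_row.append(value)
                continue
            rated = [k for k, r in enumerate(row) if r is not None]
            num = sum(sim[j][k] * row[k] for k in rated)
            den = sum(abs(sim[j][k]) for k in rated)
            new_row.append(num / den if den else 0.0)
        filled.append(new_row)
    return filled

# Hypothetical data: 2 users x 3 pages, preferences already scaled to [0, 1].
access = [[1.0, None, 0.5],
          [0.8, 0.6, None]]
features = [[1, 0], [1, 0], [0, 1]]  # pages 0 and 1 share content
filled = fill_missing(access, page_similarity(features))
```

After filling, both users have a value for every page, so a user-to-user similarity (for example `cosine(filled[0], filled[1])`) can be computed over all pages instead of only the single page both users originally visited.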
Keywords
Web sites; data mining; information filtering; collaborative filtering; content-based filtering; distributed Web log mining; distributed web server clustering; hybrid information filtering algorithm; manual rating; multiagent technology; recommendation quality; user browsing preference; Clustering algorithms; Collaboration; Filtering algorithms; Information filtering; Information filters; Manuals; Mirrors; Service oriented architecture; Web pages; Web server;
fLanguage
English
Publisher
ieee
Conference_Titel
Convergence and Hybrid Information Technology, 2008. ICCIT '08. Third International Conference on
Conference_Location
Busan
Print_ISBN
978-0-7695-3407-7
Type
conf
DOI
10.1109/ICCIT.2008.39
Filename
4682178