Title :
Bayesian analysis of online newspaper log data
Author :
Wettig, Hannes ; Lahtinen, Jussi ; Lepola, Tuomas ; Myllymaki, Petri ; Tirri, Henry
Author_Institution :
Complex Syst. Comput. Group (CoSCo), Helsinki Univ., Finland
Abstract :
In this paper we address the problem of analyzing Web log data collected at a typical online newspaper site. We propose a two-way clustering technique based on probability theory. On the one hand, the suggested method clusters the readers of the online newspaper into user groups with similar browsing behaviour, where the clusters are determined solely from the collected click streams. On the other hand, the articles of the newspaper are clustered based on the reading behaviour of the users. The two-way clustering produces statistical user and page profiles that can be analyzed by domain experts for content personalization. In addition, the resulting model can be used for on-line prediction: given the user cluster of a person entering the site and the page cluster of an article, one can infer whether or not the user will look at the page in question.
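The on-line prediction step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model, so everything here is an assumption for illustration: cluster assignments are taken as already given, each (user cluster, page cluster) pair gets one read probability estimated as a Beta-Bernoulli posterior mean, and the names `fit_click_table`, `predict_read`, and `alpha` are hypothetical.

```python
from collections import defaultdict


def fit_click_table(clicks, user_cluster, page_cluster, alpha=1.0):
    """Estimate P(read | user cluster, page cluster) from observed reads.

    clicks: set of (user, page) pairs observed in the log.
    user_cluster, page_cluster: dicts mapping each user / page to a cluster id
        (assumed to be known already; the clustering itself is not shown).
    alpha: pseudo-count of a symmetric Beta prior (an assumed smoothing choice).
    """
    reads = defaultdict(int)
    trials = defaultdict(int)
    # Every (user, page) pair counts as one trial for its cluster pair.
    for user, cu in user_cluster.items():
        for page, cp in page_cluster.items():
            trials[(cu, cp)] += 1
            if (user, page) in clicks:
                reads[(cu, cp)] += 1
    # Posterior mean of a Bernoulli parameter under a Beta(alpha, alpha) prior.
    return {k: (reads[k] + alpha) / (trials[k] + 2 * alpha) for k in trials}


def predict_read(table, cu, cp, threshold=0.5):
    """Predict whether a user in cluster cu will look at a page in cluster cp."""
    return table[(cu, cp)] >= threshold


# Toy data, invented purely for illustration.
user_cluster = {"a": 0, "b": 0, "c": 1}
page_cluster = {"x": 0, "y": 0, "z": 1}
clicks = {("a", "x"), ("a", "y"), ("b", "x"), ("c", "z")}

table = fit_click_table(clicks, user_cluster, page_cluster)
```

With this toy data, `predict_read(table, 0, 0)` is true and `predict_read(table, 0, 1)` is false, since users in cluster 0 read most pages in page cluster 0 and none in page cluster 1.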
Keywords :
Bayes methods; Internet; Web sites; belief networks; electronic publishing; information retrieval; pattern clustering; Bayesian analysis; Web log data analysis; click streams; content personalization; on-line prediction; online newspaper log data; online newspaper site; probability theory; reader clustering; reading behaviour; similar browsing behaviour; statistical page profiles; statistical user profiles; two-way clustering technique; user groups; Bayesian methods; Cleaning; Clustering algorithms; Conferences; Data mining; Demography; Law; Legal factors; Uniform resource locators
Conference_Title :
Applications and the Internet Workshops, 2003. Proceedings. 2003 Symposium on
Print_ISBN :
0-7695-1873-7
DOI :
10.1109/SAINTW.2003.1210173