Title :
Grouping Web page references into transactions for mining World Wide Web browsing patterns
Author :
Cooley, R. ; Mobasher, B. ; Srivastava, J.
Author_Institution :
Dept. of Comput. Sci. & Eng., Minnesota Univ., Minneapolis, MN, USA
Abstract :
Web-based organizations often generate and collect large volumes of data in their daily operations. Analyzing such data involves the discovery of meaningful relationships from a large collection of primarily unstructured data, often stored in Web server access logs. While traditional domains for data mining, such as point of sale databases, have naturally defined transactions, there is no convenient method of clustering Web references into transactions. This paper identifies a model of user browsing behavior that separates Web page references into those made for navigation purposes and those made for information content purposes. A transaction identification method based on the browsing model is defined and successfully tested against other methods, such as the maximal forward reference algorithm proposed in (Chen et al., 1996). Transactions identified by the proposed methods are used to discover association rules from real world data using the WEBMINER system.
Keywords :
Internet; data analysis; human factors; information retrieval; knowledge acquisition; transaction processing; user modelling; very large databases; WEBMINER system; Web page reference grouping; Web server access logs; Web-based organizations; World Wide Web; association rules; browsing pattern mining; data mining; information content; knowledge discovery; large data volumes; maximal forward reference algorithm; point of sale databases; system navigation; transaction identification method; unstructured data; user model; Association rules; Computer science; Data analysis; Data engineering; Data mining; Navigation; Testing; Web pages; Web server; Web sites
Conference_Title :
Knowledge and Data Engineering Exchange Workshop, 1997. Proceedings
Conference_Location :
Newport Beach, CA
Print_ISBN :
0-8186-8230-2
DOI :
10.1109/KDEX.1997.629824