Title :
Real time Web usage mining: a heuristic based distributed miner
Author :
Masseglia, Florent ; Teisseire, Maguelonne ; Poncelet, Pascal
Author_Institution :
LIRMM, CNRS, Montpellier, France
Abstract :
The behaviour of a Web site's users may change so quickly that attempting to make predictions from the frequent patterns extracted from an access log file becomes challenging. To keep the obsolescence of the behavioural patterns as low as possible, the ideal method would provide frequent patterns in real time, making the result available immediately. In this paper, we propose a method for finding frequent behavioural patterns in real time, whatever the number of connected users. Considering how fast frequent behavioural patterns can change after the last analysis of the access log file, this result provides navigation schemas that are fully adapted to predicting user behaviour. Based on a distributed heuristic, our method also addresses several problems tackled within the data mining framework: discovering "interesting zones" (a great number of frequent patterns concentrated over a period of time, or the discovery of "super-frequent" patterns), discovering very long sequential patterns, and interactive data mining ("on the fly" modification of the minimum support).
Keywords :
data mining; real-time systems; access log file; behavioural patterns; distributed heuristic; interactive data mining; long sequential patterns; zone mining; Electronic commerce; Information analysis; Navigation; Organizing; Pattern analysis; Performance analysis; Uniform resource locators; Web server; Web sites
Conference_Title :
Web Information Systems Engineering, 2001. Proceedings of the Second International Conference on
Print_ISBN :
0-7695-1393-X
DOI :
10.1109/WISE.2001.996490