Title :
Investigation of internet system user behaviour using cluster analysis
Author :
Dariusz Krol;Michal Scigajlo;Bogdan Trawinski
Author_Institution :
Wrocław University of Technology, Institute of Applied Informatics, Wyb. Wyspiańskiego 27, 50-370, Poland
fDate :
7/1/2008 12:00:00 AM
Abstract :
This paper presents a method for investigating the activity of users of a Web information system by means of clustering. On the basis of a Web server log, anonymous sessions are determined in the form of 65-dimensional vectors, where each dimension represents an individual page of the Web system. Each dimension holds the value of a measure of user interest in a page during a given session. This value is calculated as the ratio of the time the user spent visiting a given page to the total time of the session. The whole set of sessions is then clustered using the HCM (Hard C-Means) algorithm. The resulting clusters are treated as user activity patterns, and those in which the user interest value for a page exceeds a given threshold, e.g. 50 per cent, are selected as clusters dominated by that page. The sessions of named users, registered in the system, are determined using an application log of user activity. The frequencies of named user sessions falling into individual clusters are calculated for a given period of time, e.g. one month. User activity can be assessed by analyzing the frequencies obtained. For example, user behavior can be regarded as deviating from the normal pattern when the frequency of sessions in a cluster dominated by a page is below a determined threshold, e.g. 10 per cent. The method was evaluated using data from a cadastral Web system operated in an extranet.
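The steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes three pages instead of 65, made-up visit times, a plain nearest-centre hard c-means loop with hand-picked initial centres, and a reading of "dominated by a page" as the cluster centre's value for that page exceeding the threshold. All function names are hypothetical.

```python
def session_vector(page_times, n_pages):
    """Interest vector: share of the session time spent on each page."""
    total = sum(seconds for _, seconds in page_times)
    vec = [0.0] * n_pages
    if total == 0:
        return vec
    for page, seconds in page_times:
        vec[page] += seconds / total
    return vec


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_c_means(vectors, initial_centers, iterations=20):
    """Hard c-means: assign each vector to its nearest centre, then recompute means."""
    centers = [list(c) for c in initial_centers]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(v, centers[k]))
            for v in vectors
        ]
        for k in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centre if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def dominant_page(center, threshold=0.5):
    """Index of the page whose interest value exceeds the threshold, else None.

    Assumption: dominance is judged on the cluster centre."""
    best = max(range(len(center)), key=lambda i: center[i])
    return best if center[best] > threshold else None


def cluster_frequencies(labels, n_clusters):
    """Fraction of the given sessions that fall into each cluster."""
    return [labels.count(k) / len(labels) for k in range(n_clusters)]


# Made-up sessions: lists of (page index, seconds spent on that page).
sessions = [
    [(0, 90), (1, 10)],
    [(0, 80), (2, 20)],
    [(1, 30), (2, 70)],
    [(2, 100)],
]
vectors = [session_vector(s, n_pages=3) for s in sessions]
labels, centers = hard_c_means(vectors, initial_centers=[vectors[0], vectors[2]])
dominated = [dominant_page(c) for c in centers]
frequencies = cluster_frequencies(labels, n_clusters=len(centers))
```

In this toy data the first two sessions concentrate on page 0 and the last two on page 2, so each of the two clusters ends up dominated by one page. For a named user, `cluster_frequencies` would be applied to the labels of that user's sessions over the chosen period and compared against the deviation threshold (e.g. 10 per cent).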
Keywords :
"Internet","Machine learning","Cybernetics","Clustering algorithms","Local government","Data mining","Web server"
Conference_Titel :
Machine Learning and Cybernetics, 2008 International Conference on
Print_ISBN :
978-1-4244-2095-7
Electronic_ISBN :
2160-1348
DOI :
10.1109/ICMLC.2008.4620993