DocumentCode :
585853
Title :
Cut-off time calculation for user session identification by reference length
Author :
Kapusta, Jozef ; Munk, Michal ; Drlík, Martin
Author_Institution :
Dept. of Inf., Constantine the Philosopher Univ. in Nitra, Nitra, Slovakia
fYear :
2012
fDate :
17-19 Oct. 2012
Firstpage :
1
Lastpage :
6
Abstract :
Discovering the behavior patterns of web site visitors is one of the tasks of web log mining. The discovered patterns, represented as sequence rules, can be used to modify and improve an organization's web site. The data for the analysis come from the web server log file. Because these data are anonymous, uniquely identifying a web site visitor is a problem. The paper deals with the less commonly used navigation-driven methods of user session identification. These methods assume that during a visit the user passes through several navigation pages until she/he finds the content page with the required information. A content page is a page on which the user spends considerably more time than on navigation pages. The content page is considered the end of the session; the search for the next content page through navigation pages constitutes a new user session. Pages are divided into content and navigation pages by calculating the cut-off time C. Verifying that the variable representing the time a user spent on a particular page follows an exponential distribution is essential to this calculation. We prepared an experiment with data from the log file of a university web server. We tested whether the time spent on web pages follows an exponential distribution and estimated the value of the cut-off time. The results support our assumption that navigation-oriented methods can be used for proper user session identification.
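The abstract does not state the formula for C. As a minimal sketch, the code below assumes the form usually cited for the reference length method: if page times are exponentially distributed with rate λ (estimated as the reciprocal of the mean observed time) and a share γ of all page views are navigation pages, then C = -ln(1 - γ) / λ. The function names, the `navigation_share` parameter, and the sample values are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean


def cutoff_time(page_times, navigation_share):
    """Estimate cut-off time C, assuming exponentially distributed page times.

    page_times: observed seconds spent on each page view.
    navigation_share: assumed proportion (gamma) of navigation pages, 0 < gamma < 1.
    """
    if not page_times:
        raise ValueError("page_times must not be empty")
    if not 0.0 < navigation_share < 1.0:
        raise ValueError("navigation_share must lie strictly between 0 and 1")
    rate = 1.0 / mean(page_times)  # maximum-likelihood estimate of lambda
    return -math.log(1.0 - navigation_share) / rate


def split_sessions(visits, cutoff):
    """Split one visitor's ordered (page, seconds) views into sessions.

    A page viewed longer than the cut-off is treated as a content page
    and closes the current session.
    """
    sessions, current = [], []
    for page, seconds in visits:
        current.append(page)
        if seconds > cutoff:
            sessions.append(current)
            current = []
    if current:  # trailing navigation pages with no content page yet
        sessions.append(current)
    return sessions


# Hypothetical sample data, for illustration only.
c = cutoff_time([10, 20, 30, 60], navigation_share=0.5)
visits = [("a", 5), ("b", 8), ("c", 120), ("d", 3), ("e", 200)]
sessions = split_sessions(visits, c)
```

With this sample input the mean time is 30 seconds, so C is 30·ln 2, roughly 20.8 seconds, and the five page views split into two sessions, each ending at a page viewed longer than C. Whether the exponential assumption holds for real log data is exactly what the paper sets out to verify.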
Keywords :
Internet; Web sites; behavioural sciences; data analysis; data mining; exponential distribution; Web log mining; Web pages; Web server log file; Web site visitor behavior; Web site visitor identification; content page; cut-off time calculation; exponential variable distribution; navigation-driven methods; pattern discovery; reference length; sequence rules; user behavior patterns; user session identification; Educational institutions; IP networks; Navigation; Web servers; Cut-off Time; Session Identification
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Application of Information and Communication Technologies (AICT), 2012 6th International Conference on
Conference_Location :
Tbilisi
Print_ISBN :
978-1-4673-1739-9
Type :
conf
DOI :
10.1109/ICAICT.2012.6398500
Filename :
6398500