Title :
Session Identification Algorithm for Web Log Mining
Author :
Peng Zhu ; Zhao, Ming-sheng
Author_Institution :
Dept. of Inf. Manage., Nanjing Univ., Nanjing, China
Abstract :
This paper studies session identification in web log mining and proposes an improved algorithm based on an average time threshold. The algorithm dynamically calculates the average interval between request records in a session and adjusts the time threshold individually. Compared with the traditional algorithm, which applies a uniform threshold to all users' web pages, it identifies long sessions more accurately. Finally, the algorithm re-identifies the generated sets of candidate sessions, which makes the identified sessions more reasonable and effective. Experimental results show that the quality of session identification is improved.
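The abstract does not give the exact update rule, so the following is only a minimal sketch of the general idea of an adaptive, per-session time threshold, not the authors' algorithm. The function name `split_sessions` and the parameters `initial_threshold`, `factor`, and `floor` are assumptions made for illustration: a request starts a new session when its gap from the previous request exceeds `factor` times the running average gap of the current session. The re-identification pass over candidate sessions mentioned in the abstract is not modeled here.

```python
from typing import List


def split_sessions(
    timestamps: List[float],
    initial_threshold: float = 600.0,  # assumed default, used before an average exists
    factor: float = 3.0,               # assumed multiplier on the running average gap
    floor: float = 60.0,               # assumed lower bound on the adaptive threshold
) -> List[List[float]]:
    """Split one user's sorted request timestamps (in seconds) into sessions.

    Illustrative rule (an assumption, not taken from the paper): the
    threshold for the current session is factor * (average gap so far),
    never below `floor`; a larger gap closes the session.
    """
    sessions: List[List[float]] = []
    current: List[float] = []
    for t in timestamps:
        if not current:
            current.append(t)
            continue
        gap = t - current[-1]
        if len(current) >= 2:
            # Average interval between consecutive requests in this session.
            avg_gap = (current[-1] - current[0]) / (len(current) - 1)
            threshold = max(factor * avg_gap, floor)
        else:
            # Only one request so far: no average yet, fall back to the default.
            threshold = initial_threshold
        if gap > threshold:
            sessions.append(current)
            current = [t]
        else:
            current.append(t)
    if current:
        sessions.append(current)
    return sessions


if __name__ == "__main__":
    # A user clicking every 10 seconds, then pausing for 370 seconds.
    print(split_sessions([0, 10, 20, 30, 400, 410, 420]))
```

In this sketch, a single uniform threshold of 600 seconds would keep all seven requests in one session, whereas the adaptive threshold separates the two bursts because the 370-second pause is long relative to that user's own request pace.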
Keywords :
Internet; data mining; Web log mining; Web pages; average time threshold value; research object; session identification; uniform threshold value; Algorithm design and analysis; Cleaning; Data mining; Heuristic algorithms; IP networks; Indexes; Knowledge engineering;
Conference_Title :
Management and Service Science (MASS), 2010 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5325-2
Electronic_ISBN :
978-1-4244-5326-9
DOI :
10.1109/ICMSS.2010.5576547