Title :
Clustering of web sessions by FOGSAA
Author :
Chakraborty, Arpan ; Bandyopadhyay, Supriyo
Author_Institution :
Machine Intell. Unit, Indian Stat. Inst., Kolkata, India
Abstract :
Clustering web sessions to identify visitors' choices while browsing web pages is an important problem in web mining. The sequence of pages viewed by a user in a particular time frame, i.e., the session, captures his/her interest in a specific topic. Clustering these sessions is therefore needed to provide customized services to users with similar interests. In this article, we propose a novel and accurate similarity measure, Psim, between two web pages, and a method of clustering web sessions using the recently developed Fast Optimal Global Sequence Alignment Algorithm (FOGSAA). FOGSAA is an optimal global alignment algorithm, used here to align pairs of sessions. The alignments yield pairwise distances, which are used to cluster the sessions into groups of similar sessions. FOGSAA aligns the sessions in much less time, with an average time gain of 35.84% over the conventional dynamic-programming-based Needleman-Wunsch method, while both generate the same optimal alignment. Applying FOGSAA to align the sessions therefore makes the procedure faster while maintaining the quality of the result.
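The pipeline described in the abstract (align each pair of sessions, then turn the alignment scores into pairwise distances) can be illustrated with a minimal Python sketch. It uses the plain Needleman-Wunsch baseline that the abstract compares against, not FOGSAA itself. The function `page_similarity` is a hypothetical stand-in based on shared URL path prefixes and is not the paper's Psim measure; the gap penalty, the mapping of similarity to match scores, and the normalization to a distance in [0, 1] are likewise assumptions made only for illustration.

```python
from typing import Callable, List, Sequence


def page_similarity(a: str, b: str) -> float:
    """Hypothetical page similarity in [0, 1] (NOT the paper's Psim):
    shared leading URL path segments divided by the longer path length."""
    pa = [seg for seg in a.split("/") if seg]
    pb = [seg for seg in b.split("/") if seg]
    if not pa and not pb:
        return 1.0
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return common / max(len(pa), len(pb))


def nw_score(s: Sequence[str], t: Sequence[str],
             sim: Callable[[str, str], float] = page_similarity,
             gap: float = -1.0) -> float:
    """Needleman-Wunsch global alignment score of two sessions.
    A similarity in [0, 1] is mapped to a match score in [-1, 1];
    the gap penalty of -1 is an assumed value."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap] + [0.0] * len(t)
        for j in range(1, len(t) + 1):
            match = prev[j - 1] + 2.0 * sim(s[i - 1], t[j - 1]) - 1.0
            cur[j] = max(match, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[-1]


def session_distance(s: Sequence[str], t: Sequence[str]) -> float:
    """Assumed normalization: with the default scores above, the alignment
    score lies in [-L, L] where L is the longer session length, so it is
    rescaled to a distance in [0, 1]."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 0.0
    return (1.0 - nw_score(s, t) / longest) / 2.0


def distance_matrix(sessions: List[Sequence[str]]) -> List[List[float]]:
    """Symmetric matrix of pairwise session distances."""
    n = len(sessions)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = session_distance(sessions[i], sessions[j])
    return d
```

The resulting matrix could be passed to any distance-based clustering method; the abstract does not state which one the authors use. Because FOGSAA is reported to produce the same optimal alignment as Needleman-Wunsch, substituting it for `nw_score` would be expected to change the running time rather than the distances.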
Keywords :
Internet; data mining; dynamic programming; information retrieval; pattern clustering; FOGSAA; Needleman-Wunsch method; Web mining; Web pages; Web session clustering; fast optimal global sequence alignment algorithm; pairwise distance; similarity measure; Clustering algorithms; NASA; Servers; Time measurement
Conference_Title :
Intelligent Computational Systems (RAICS), 2013 IEEE Recent Advances in
Conference_Location :
Trivandrum
Print_ISBN :
978-1-4799-2177-5
DOI :
10.1109/RAICS.2013.6745488