Title :
Similarity Measurement of Web Sessions by Sequence Alignment
Author :
Li, Chaofeng ; Lu, Yansheng
Author_Institution :
South-Central Univ. for Nationalities, Wuhan
Abstract :
Clustering Web sessions means grouping them by similarity so that intra-group similarity is maximized and inter-group similarity is minimized. The first question to address is therefore how to measure the similarity between Web sessions, and traditional measurements have many shortcomings. This paper analyses those shortcomings and introduces a new method for measuring the similarity between Web pages that considers not only the URL but also the viewing time of the visited page. We then propose a new method for measuring the similarity of Web sessions that uses sequence alignment together with this page-level similarity. Finally, we conclude the paper and propose future research directions.
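The abstract gives no formulas, so the following is only an illustrative sketch of the general idea: a page-level similarity that combines URL and viewing time, fed into a global sequence alignment of two sessions. Everything specific here is an assumption and not the authors' definition: the function names, the prefix-based URL measure, the min/max ratio for viewing time, the product used to combine them, the gap penalty, and the normalization by the longer session.

```python
def url_similarity(a, b):
    """Assumed measure: share of leading URL components two URLs have in common."""
    pa = [c for c in a.split("/") if c]
    pb = [c for c in b.split("/") if c]
    if not pa and not pb:
        return 1.0
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return common / max(len(pa), len(pb))


def time_similarity(t1, t2):
    """Assumed measure: ratio of shorter to longer viewing time (non-negative seconds)."""
    if t1 == t2:
        return 1.0  # also covers the case where both times are zero
    return min(t1, t2) / max(t1, t2)


def page_similarity(p, q):
    """Combine URL and viewing-time similarity; the product is an assumption."""
    (u1, t1), (u2, t2) = p, q
    return url_similarity(u1, u2) * time_similarity(t1, t2)


def session_similarity(s1, s2, gap=-1.0):
    """Needleman-Wunsch style global alignment of two sessions.

    Each session is a list of (url, viewing_time) pairs. The alignment
    score is divided by the longer session length and clipped at 0.
    """
    n, m = len(s1), len(s2)
    if n == 0 or m == 0:
        return 0.0
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Rescale page similarity from [0, 1] to [-1, 1] so that
            # aligning dissimilar pages is penalized like a mismatch.
            match = 2.0 * page_similarity(s1[i - 1], s2[j - 1]) - 1.0
            score[i][j] = max(
                score[i - 1][j - 1] + match,  # align the two pages
                score[i - 1][j] + gap,        # gap in s2
                score[i][j - 1] + gap,        # gap in s1
            )
    return max(0.0, score[n][m] / max(n, m))
```

Under these assumptions, two identical sessions score 1.0 and two sessions with no shared URL components score 0.0; a pairwise matrix of such scores could then serve as input to a clustering algorithm.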
Keywords :
Internet; pattern clustering; sequences; URL; Web pages; Web session clustering; inter-group similarity; sequence alignment; similarity measurement; Clustering algorithms; Computational biology; Data mining; Educational institutions; Euclidean distance; Navigation; Time measurement; Uniform resource locators
Conference_Title :
Network and Parallel Computing Workshops, 2007. NPC Workshops. IFIP International Conference on
Conference_Location :
Liaoning
Print_ISBN :
978-0-7695-2943-1
DOI :
10.1109/NPC.2007.66