Title :
WhatNext: a prediction system for Web requests using n-gram sequence models
Author :
Su, Zhong ; Yang, Qiang ; Lu, Ye ; Zhang, Hongjang
Author_Institution :
Dept. of Comput. Sci., Tsinghua Univ., Beijing, China
Abstract :
As an increasing number of users access information on the Web, there is a great opportunity to learn from server logs about the users' probable future actions. We present an n-gram based model that uses path profiles of users from very large data sets to predict the users' future requests. Since this is a prediction system, we cannot measure recall in the traditional sense. We therefore present the notion of applicability as a measure of the ability to predict the next document. Our model is a simple extension of existing point-based models for such predictions, but our results show that for n-gram based prediction, when n is greater than three, we can increase precision by 20% or more for two realistic Web logs. We also present an efficient method that can compress our model to 30% of its original size so that the model can be loaded in main memory. Our result can potentially be applied to a wide range of applications on the Web, including pre-sending, pre-fetching, enhancement of recommendation systems, and Web caching policies. Our tests are based on three realistic Web logs. Our algorithm is implemented in a prediction system called WhatNext, which shows a marked improvement in precision and applicability over previous approaches.
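The idea in the abstract can be illustrated with a minimal Python sketch. It is not the WhatNext implementation: the function names (train_ngram, predict, evaluate) are hypothetical, and the metric definitions are assumptions, since the abstract does not spell them out. Here precision is taken as correct predictions divided by predictions made, and applicability as predictions made divided by prediction opportunities. The model compression step is not shown.

```python
from collections import Counter, defaultdict


def train_ngram(sessions, n):
    """Map each path of n consecutive requests to counts of the request that followed."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - n):
            context = tuple(session[i:i + n])
            model[context][session[i + n]] += 1
    return model


def predict(model, recent, n):
    """Return the most frequent follower of the last n requests, or None if the path is unseen."""
    context = tuple(recent[-n:])
    if len(context) < n or context not in model:
        return None
    return model[context].most_common(1)[0][0]


def evaluate(model, sessions, n):
    """Return (precision, applicability) under the assumed definitions in the lead-in."""
    made = correct = total = 0
    for session in sessions:
        for i in range(n, len(session)):
            total += 1  # one opportunity to predict the request at position i
            guess = predict(model, session[:i], n)
            if guess is not None:
                made += 1
                correct += guess == session[i]
    precision = correct / made if made else 0.0
    applicability = made / total if total else 0.0
    return precision, applicability


# Toy sessions; the page names are invented for illustration.
train_sessions = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "e"],
    ["a", "b", "c", "d"],
]
test_sessions = [["a", "b", "c", "d"], ["x", "y", "z"]]

model = train_ngram(train_sessions, n=2)
precision, applicability = evaluate(model, test_sessions, n=2)
```

In this sketch a longer context (larger n) matches fewer paths, so it can make predictions less often but bases each one on a more specific history. This is the trade-off that the precision and applicability measures are meant to capture.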
Keywords :
Internet; data mining; information resources; information retrieval; very large databases; Web caching policies; Web logs; Web request prediction system; WhatNext; information access; n-gram sequence models; pre-fetching; pre-sending; recommendation systems; server logs; user path profiles; very large data sets; Load modeling; Predictive models; Sea measurements; Search engines; Telecommunication traffic; Testing; Web server
Conference_Title :
Web Information Systems Engineering, 2000. Proceedings of the First International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-0577-5
DOI :
10.1109/WISE.2000.882395