DocumentCode :
935227
Title :
A keyword-based semantic prefetching approach in Internet news services
Author :
Xu, Cheng-Zhong ; Ibrahim, Tamer I.
Author_Institution :
Dept. of Electr. & Comput. Eng., Wayne State Univ., Detroit, MI, USA
Volume :
16
Issue :
5
fYear :
2004
fDate :
5/1/2004
Firstpage :
601
Lastpage :
611
Abstract :
Prefetching is an important technique for reducing average Web access latency. Existing prefetching methods are based mostly on URL graphs: they use the graph structure of HTTP links to determine the possible paths through a hypertext system. Although URL graph-based approaches are effective at prefetching frequently accessed documents, few of them can prefetch URLs that are rarely visited. The paper presents a keyword-based semantic prefetching approach to overcome this limitation. It predicts future requests based on the semantic preferences expressed in previously retrieved Web documents. We apply this technique to Internet news services and implement a client-side personalized prefetching system, NewsAgent. The system exploits semantic preferences by analyzing keywords in the URL anchor text of previously accessed documents in different news categories. It employs a neural network model over the keyword set to predict future requests. The system features a self-learning capability and adapts well to changes in client surfing interest. To keep prefetching conservative, NewsAgent does not exploit keyword synonymy. However, it alleviates the impact of keyword polysemy by taking server-provided categorical information into account in decision-making and, hence, captures more semantic knowledge than term-document literal matching methods. Experimental results from daily browsing of the ABC News, CNN, and MSNBC news sites over a period of three months show that prefetching achieves a hit ratio of up to 60 percent.
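The abstract does not give the network architecture, so the following is only a minimal sketch of the general idea, assuming a single threshold unit with one weight per (category, keyword) pair. The class name `KeywordPerceptron`, the tokenizer, the threshold, and the learning rate are all hypothetical and are not taken from NewsAgent.

```python
class KeywordPerceptron:
    """Hypothetical single-unit predictor; NOT the NewsAgent implementation."""

    def __init__(self, threshold=0.5, learning_rate=0.1):
        # Both values are illustrative assumptions, not figures from the paper.
        self.threshold = threshold
        self.learning_rate = learning_rate
        # One weight per (category, keyword) pair, so the same word can carry
        # a different weight in different news categories (polysemy).
        self.weights = {}

    @staticmethod
    def keywords(anchor_text):
        # Naive tokenizer (assumption): lowercase alphabetic words
        # longer than three letters.
        words = anchor_text.lower().split()
        return {w for w in words if w.isalpha() and len(w) > 3}

    def score(self, category, anchor_text):
        # Weighted sum over the keywords found in the link's anchor text.
        return sum(self.weights.get((category, k), 0.0)
                   for k in self.keywords(anchor_text))

    def should_prefetch(self, category, anchor_text):
        return self.score(category, anchor_text) >= self.threshold

    def learn(self, category, anchor_text, was_requested):
        # Perceptron-style update: weights move only when the prediction
        # disagrees with what the user actually did.
        predicted = self.should_prefetch(category, anchor_text)
        error = int(was_requested) - int(predicted)
        for k in self.keywords(anchor_text):
            key = (category, k)
            self.weights[key] = self.weights.get(key, 0.0) + self.learning_rate * error


model = KeywordPerceptron()
for _ in range(5):
    model.learn("sports", "Lakers win championship game", True)

print(model.should_prefetch("sports", "Lakers championship game tonight"))
print(model.should_prefetch("business", "Lakers championship game tonight"))
```

Keying the weights by category is one simple way to let server-provided categories separate different senses of the same keyword. Because weights are matched on literal keywords only, the sketch, like the system described, makes no use of synonymy.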
Keywords :
Internet; Web sites; learning (artificial intelligence); neural nets; storage management; ABC News; CNN; MSNBC news sites; Internet news services; NewsAgent; URL anchor text; average Web access latency; client surfing interest; client-side personalized prefetching system; future request prediction; keyword polysemy; keyword-based semantic prefetching; neural network model; past retrieved Web documents; personalized news service; self-learning capability; semantic knowledge; semantic preferences; server-provided categorical information; Decision making; Delay; History; Hypertext systems; Network servers; Neural networks; Predictive models; Prefetching; Uniform resource locators; Web and internet services
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2004.1277820
Filename :
1277820