DocumentCode :
1918192
Title :
Topic distillation on hierarchically categorized Web documents
Author :
Katz, Vadim ; Li, Wen-Syan
fYear :
1999
fDate :
1999
Firstpage :
34
Lastpage :
41
Abstract :
As an alternative to search capability, many search engines are providing directory servers containing categorized Web documents for users to navigate and browse through. We are investigating three issues in portal site construction given a large collection of categorized Web documents: (1) distillation of important topics for each category of documents; (2) distillation of important documents/sites for these topics; and (3) automation of these two tasks. We have developed an automated technique for topic and Web site distillation. Our technique integrates Web document content analysis and link structure analysis. It considers the local importance of keywords and their global distribution statistics on a given Web document category hierarchy.
Keywords :
document handling; information resources; information retrieval; search engines; Web document category hierarchy; Web document content analysis; Web site distillation; automated technique; categorized Web documents; directory servers; global distribution statistics; hierarchically categorized Web documents; link structure analysis; portal site construction; search capability; topic distillation; Portals; Uniform resource locators
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Proceedings of the 1999 Workshop on Knowledge and Data Engineering Exchange (KDEX '99)
Conference_Location :
Chicago, IL
Print_ISBN :
0-7695-0453-1
Type :
conf
DOI :
10.1109/KDEX.1999.836529
Filename :
836529