DocumentCode :
3393691
Title :
Data collection for evaluating automatic filtering of hazardous WWW information
Author :
Hoashi, Keiichiro ; Inoue, Naomi ; Hashimoto, Kazuo
Author_Institution :
KDD R&D Labs. Inc., Saitama, Japan
fYear :
1999
fDate :
1999
Firstpage :
157
Lastpage :
164
Abstract :
We describe a data collection constructed for evaluating automatic filtering of hazardous WWW information. Currently, there are three types of filtering systems: self-rating, individual rating, and automatic filtering. Based on an analysis of existing systems, we propose an ideal system architecture for effective filtering. To develop our filtering system, we collected a large amount of hazardous WWW data. We had presumed that WWW pages with few words are difficult to filter automatically, but analysis of our data collection indicates that effective automatic filtering can be achieved by exploiting the hierarchy of HTML data. We also confirmed this hypothesis in evaluation experiments using an experimental automatic filtering algorithm.
Keywords :
Internet; hypermedia markup languages; information resources; information retrieval; HTML; Web pages; World Wide Web; automatic information filtering; data collection; hazardous Web information; individual rating; self rating; Data analysis; Filtering algorithms; Information filtering; Information filters; Laboratories; Research and development
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Internet Workshop, 1999. IWS 99
Conference_Location :
Osaka
Print_ISBN :
0-7803-5925-9
Type :
conf
DOI :
10.1109/IWS.1999.811008
Filename :
811008