Title :
Metadata propagation in the Web using co-citations
Author :
Prime-Claverie, Camille ; Beigbeder, Michel ; Lafouge, Thierry
Author_Institution :
Lab. RIM/G21, Ecole Nationale Superieure des Mines, Saint-Etienne, France
Abstract :
Given the great heterogeneity of the World Wide Web, using metadata on the search engine side appears to be a useful approach to information retrieval. However, because manual qualification at Web scale is not feasible, this approach is rarely pursued. We propose a semi-automatic method for propagating metadata. In the first step, homogeneous corpora are extracted. Our study used the following properties: the authority type, the site type, the information type, and the page type. This first step is carried out by a clustering that uses a similarity measure based on the co-citation frequency between pages. Given the cluster hierarchy, the second step selects a small number of documents to be manually qualified and propagates the resulting metadata values to the other documents in the same cluster. A qualitative evaluation and a preliminary study of the method's scalability are presented.
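The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy link graph, the `site_type` values, and the `min_cocitations` threshold are all assumptions made for illustration, and a flat threshold-based grouping (union-find) stands in for the cluster hierarchy the abstract mentions.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_counts(links):
    """Count, for each pair of pages, how many citing pages link to both.

    `links` maps a citing page to the set of pages it links to.
    """
    counts = defaultdict(int)
    for cited in links.values():
        for a, b in combinations(sorted(cited), 2):
            counts[(a, b)] += 1
    return dict(counts)


def cluster(pages, counts, min_cocitations=2):
    """Group pages joined by at least `min_cocitations` co-citations.

    Assumption: a flat grouping replaces the hierarchical clustering.
    """
    parent = {p: p for p in pages}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for (a, b), n in counts.items():
        if n >= min_cocitations:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in pages:
        groups[find(p)].add(p)
    return list(groups.values())


def propagate(clusters, manual_labels):
    """Copy the metadata of one manually qualified page to its whole cluster."""
    labels = {}
    for group in clusters:
        seeds = [p for p in sorted(group) if p in manual_labels]
        if seeds:
            for p in group:
                labels[p] = manual_labels[seeds[0]]
    return labels


# Hypothetical toy data: four citing pages and the pages they link to.
links = {
    "hub1": {"a", "b", "c"},
    "hub2": {"a", "b"},
    "hub3": {"c", "d"},
    "hub4": {"c", "d"},
}
pages = set().union(*links.values())
counts = cocitation_counts(links)
clusters = cluster(pages, counts)

# Only one page per cluster is qualified by hand (values are invented).
manual_labels = {
    "a": {"site_type": "academic"},
    "c": {"site_type": "commercial"},
}
labels = propagate(clusters, manual_labels)
```

In this toy graph, pages `a` and `b` are co-cited twice, as are `c` and `d`, so two clusters form and each inherits the metadata of its single hand-labelled page. A raw co-citation count is used here for simplicity; a normalized similarity measure could be substituted without changing the overall structure.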
Keywords :
Internet; citation analysis; meta data; search engines; World Wide Web; authority type; co-citation; information retrieval; information type; metadata propagation; page type; search engine; site type; Data mining; Databases; Frequency measurement; Information retrieval; Libraries; Qualifications; Scalability; Search engines; Web pages; Web sites
Conference_Title :
Web Intelligence, 2005. Proceedings. The 2005 IEEE/WIC/ACM International Conference on
Print_ISBN :
0-7695-2415-X