Title :
A method for indexing Web pages using Web bots
Author :
Szymanski, Boleslaw K. ; Chung, Ming-Shu
Author_Institution :
Dept. of Comput. Sci., Rensselaer Polytech. Inst., Troy, NY, USA
Abstract :
Today's search engines use one of two approaches to indexing Web pages. They either (i) analyze the frequency of the words appearing in all or part of the text of the target Web page, or (ii) use sophisticated algorithms that take into account associations of words in the indexed Web page. The basic difference between the existing methods and the one discussed here is that the existing methods rely on the structure of Web page linkages that lead to or from the indexed page. In contrast, our method indexes a page using the content of the pages linked to or from it. Thus our method uses the structure of words used by the linked pages, whereas current methods use the structure of the connections between linked pages. We propose and demonstrate the use of a new method based on bots that analyze the content of the pages linked to or from the page of interest. We analyze the similarity of word usage at different link distances from the page of interest and demonstrate that the structure of words used by the linked pages enables more efficient indexing and search.
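The idea of comparing word usage at different link distances can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a similarity measure, so cosine similarity over raw word counts is an assumption here, as are the function names (`word_counts`, `pages_by_distance`, `similarity_by_distance`) and the toy in-memory `pages` and `links` data that stand in for pages a bot would fetch.

```python
import math
import re
from collections import Counter, deque


def word_counts(text):
    """Count lowercase alphabetic words in a page's text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two word-count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pages_by_distance(links, start, max_dist):
    """Group pages by link distance from `start`, following links in
    both directions (pages linked to or from a page)."""
    neighbors = {}
    for src, dsts in links.items():
        for dst in dsts:
            neighbors.setdefault(src, set()).add(dst)
            neighbors.setdefault(dst, set()).add(src)
    dist = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search, cut off at max_dist
        page = queue.popleft()
        if dist[page] == max_dist:
            continue
        for nxt in neighbors.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    layers = {}
    for page, d in dist.items():
        if d > 0:
            layers.setdefault(d, []).append(page)
    return layers


def similarity_by_distance(pages, links, start, max_dist=2):
    """Average similarity between `start` and the pages at each distance."""
    target = word_counts(pages[start])
    result = {}
    for d, layer in pages_by_distance(links, start, max_dist).items():
        sims = [cosine(target, word_counts(pages[p])) for p in layer if p in pages]
        if sims:
            result[d] = sum(sims) / len(sims)
    return result


# Hypothetical toy data: page text keyed by name, and outgoing links.
pages = {
    "a": "web bots index web pages",
    "b": "bots index pages",
    "c": "cooking recipes",
}
links = {"a": ["b"], "b": ["c"]}
scores = similarity_by_distance(pages, links, "a")
```

In this toy graph, `scores[1]` compares page `a` with its direct neighbor `b`, and `scores[2]` compares it with `c`, which shares no words with `a`. A real bot would also need page fetching, HTML parsing, and stop-word handling, none of which is shown.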
Keywords :
hypermedia markup languages; indexing; search engines; software agents; Web bots; Web pages; associations; authority pages; automatic indexing; expert pages; indexed page; link distance; linked pages; search; similarity; word usage; Computer science; Frequency; Humans; Information analysis; Information filtering; Information filters; Machine assisted indexing; World Wide Web
Conference_Title :
Info-tech and Info-net, 2001. Proceedings. ICII 2001 - Beijing. 2001 International Conferences on
Conference_Location :
Beijing
Print_ISBN :
0-7803-7010-4
DOI :
10.1109/ICII.2001.983028