• DocumentCode
    2232480
  • Title

    A method for indexing Web pages using Web bots

  • Author

    Szymanski, Boleslaw K. ; Chung, Ming-Shu

  • Author_Institution
    Dept. of Comput. Sci., Rensselaer Polytech. Inst., Troy, NY, USA
  • Volume
    3
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    1
  • Abstract
    Today's search engines use one of two approaches to indexing Web pages. They either (i) analyze the frequency of the words appearing in all or part of the text of the target Web page, or (ii) use sophisticated algorithms to take into account associations of words in the indexed Web page. The basic difference between the existing methods and the one discussed here is that the existing methods rely on the structure of Web page linkages that lead from or to the indexed page. In contrast, our method uses the content of the pages linked to or from the indexed page for indexing. Thus our method uses the structure of words used by the linked pages, whereas the current methods use the structure of the connections between linked pages. We propose and demonstrate the use of a new method based on bots that analyze the content of the pages linked to or from the page of interest. We analyze the similarity of word usage at different link distances from the page of interest and demonstrate that the structure of words used by the linked pages enables more efficient indexing and search.
  • Keywords
    hypermedia markup languages; indexing; search engines; software agents; Web bots; Web pages; associations; authority pages; automatic indexing; expert pages; indexed page; link distance; linked pages; search; search engines; similarity; word usage; Computer science; Frequency; Humans; Information analysis; Information filtering; Information filters; Machine assisted indexing; Search engines; Web pages; World Wide Web
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Info-tech and Info-net, 2001. Proceedings. ICII 2001 - Beijing. 2001 International Conferences on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7803-7010-4
  • Type

    conf

  • DOI
    10.1109/ICII.2001.983028
  • Filename
    983028