• DocumentCode
    1810393
  • Title
    An algorithm for automatic Web-page clustering using link structures
  • Author
    Mukhopadhyay, Debdeep ; Sing, Sanasam Rinbir
  • Author_Institution
    Dept. of Comput. Sci. & Eng., St. Thomas Coll. of Eng. & Technol., Kolkata, India
  • fYear
    2004
  • fDate
    20-22 Dec. 2004
  • Firstpage
    472
  • Lastpage
    477
  • Abstract
    The Web contains a large collection of heterogeneous documents. As a result, finding a set of related pages on the Web is currently one of its most crucial problems. Low-precision Web search engines such as Excite and AltaVista, coupled with ranked-list presentation, make it harder for users to find the information they are looking for. In this paper, we propose a methodology to cluster related pages using co-citations, without manual study or predefined categories. These clusters are then used to classify random pages in the Universe.
  • Keywords
    Internet; citation analysis; search engines; text analysis; automatic Web-page clustering; cocitation; heterogeneous document; link structure; precision Web search engine; ranked list presentation; Clustering algorithms; Computer science; Educational institutions; Ink; Large-scale systems; Motion pictures; Search engines; Text categorization; Web pages; Web search;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    India Annual Conference, 2004. Proceedings of the IEEE INDICON 2004. First
  • Print_ISBN
    0-7803-8909-3
  • Type
    conf
  • DOI
    10.1109/INDICO.2004.1497798
  • Filename
    1497798