Title :
An algorithm for automatic Web-page clustering using link structures
Author :
Mukhopadhyay, Debdeep ; Sing, Sanasam Rinbir
Author_Institution :
Dept. of Comput. Sci. & Eng., St. Thomas Coll. of Eng. & Technol., Kolkata, India
Abstract :
The Web contains a large collection of heterogeneous documents. As a result, finding a set of related pages on the Web is currently one of the most crucial problems. The low precision of Web search engines such as Excite and Alta Vista, coupled with their ranked-list presentation, makes it harder for users to find the information they are looking for. In this paper, we propose a methodology to cluster related pages using co-citations, without manual study or predefined categories. These clusters are then used to classify random pages in the Universe.
Keywords :
Internet; citation analysis; search engines; text analysis; automatic Web-page clustering; cocitation; heterogeneous document; link structure; precision Web search engine; ranked list presentation; Clustering algorithms; Computer science; Educational institutions; Ink; Large-scale systems; Motion pictures; Search engines; Text categorization; Web pages; Web search;
Conference_Title :
India Annual Conference, 2004. Proceedings of the IEEE INDICON 2004. First
Print_ISBN :
0-7803-8909-3
DOI :
10.1109/INDICO.2004.1497798