DocumentCode
1810393
Title
An algorithm for automatic Web-page clustering using link structures
Author
Mukhopadhyay, Debdeep ; Sing, Sanasam Rinbir
Author_Institution
Dept. of Comput. Sci. & Eng., St. Thomas Coll. of Eng. & Technol., Kolkata, India
fYear
2004
fDate
20-22 Dec. 2004
Firstpage
472
Lastpage
477
Abstract
The Web contains a large collection of heterogeneous documents. As a result, finding a set of related pages on the Web is currently one of the most crucial problems. Low-precision Web search engines such as Excite and Alta Vista, coupled with ranked-list presentation of results, make it harder for users to find the information they are looking for. In this paper, we propose a methodology to cluster related pages using co-citations, without manual study or predefined categories. These clusters are then used to classify random pages in the Universe.
Keywords
Internet; citation analysis; search engines; text analysis; automatic Web-page clustering; cocitation; heterogeneous document; link structure; precision Web search engine; ranked list presentation; Clustering algorithms; Computer science; Educational institutions; Ink; Large-scale systems; Motion pictures; Search engines; Text categorization; Web pages; Web search
fLanguage
English
Publisher
ieee
Conference_Titel
India Annual Conference, 2004. Proceedings of the IEEE INDICON 2004. First
Print_ISBN
0-7803-8909-3
Type
conf
DOI
10.1109/INDICO.2004.1497798
Filename
1497798