DocumentCode
402856
Title
A Web document clustering algorithm based on concept of neighbor
Author
Song, Jiang-Chun ; Shen, Jun-Yi
Author_Institution
Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., China
Volume
1
fYear
2003
fDate
2-5 Nov. 2003
Firstpage
46
Abstract
With the rapid development of the WWW, it has gradually become the most important resource for transferring and sharing global information, and it holds great latent potential. In recent years, Web mining research has attracted broad attention and produced many results. The nearest neighbor technique, a distance-based hierarchical clustering method, has been widely applied because of its efficiency and validity. In this paper, based on the vector space model (VSM) of Web documents, we improve the nearest neighbor method, propose a new Web document clustering algorithm, and study the algorithm's validity and scalability as well as its time and space complexity.
Keywords
Web sites; computational complexity; data mining; information retrieval systems; unsupervised learning; Web document clustering algorithm; Web mining; World Wide Web; global information; nearest neighbor method; space complexity; time complexity; unsupervised learning; vector space model; Clustering algorithms; Clustering methods; Computer science; Data mining; Nearest neighbor searches; Pattern analysis; Scalability; Unsupervised learning; Web mining; World Wide Web;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN
0-7803-8131-9
Type
conf
DOI
10.1109/ICMLC.2003.1264440
Filename
1264440