• DocumentCode
    402856
  • Title

    A Web document clustering algorithm based on concept of neighbor

  • Author

    Song, Jiang-Chun ; Shen, Jun-Yi

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., China
  • Volume
    1
  • fYear
    2003
  • fDate
    2-5 Nov. 2003
  • Firstpage
    46
  • Abstract
    With the rapid development of the WWW, it has gradually become the most important resource for transferring and sharing global information, and it holds great latent potential. In recent years, research on Web mining has attracted broad attention and produced many results. The nearest neighbor technique, a distance-based hierarchical clustering method, has been widely applied because of its efficiency and validity. In this paper, based on the vector space model (VSM) of Web documents, we improve the nearest neighbor method, propose a new Web document clustering algorithm, and study the algorithm's validity and scalability as well as its time and space complexity.
  • Keywords
    Web sites; computational complexity; data mining; information retrieval systems; unsupervised learning; Web document clustering algorithm; Web mining; World Wide Web; global information; nearest neighbor method; space complexity; time complexity; unsupervised learning; vector space model; Clustering algorithms; Clustering methods; Computer science; Data mining; Nearest neighbor searches; Pattern analysis; Scalability; Unsupervised learning; Web mining; World Wide Web;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2003 International Conference on
  • Print_ISBN
    0-7803-8131-9
  • Type

    conf

  • DOI
    10.1109/ICMLC.2003.1264440
  • Filename
    1264440