• DocumentCode
    2300164
  • Title
    DFSSM based Web text clustering algorithm
  • Author
    Rong Qian ; Kejun Zhang ; Xiaorong Zhao
  • Author_Institution
    Dept. of Comput. Sci., Beijing Electron. Sci. & Technol. Inst., Beijing, China
  • fYear
    2012
  • fDate
    29-31 Dec. 2012
  • Firstpage
    703
  • Lastpage
    707
  • Abstract
    A key challenge in data mining is tackling the problem of mining richly structured datasets such as Web pages. In this paper, we propose a Web text clustering algorithm (WTCA) based on DFSSM, which is our original work. The algorithm includes an SOM training stage and a clustering stage. It can distinguish the most meaningful features in the Concept Space without an evaluation function. We applied the algorithm to the Chinese Modern Long-distance Education Network and compared it with several popular clustering algorithms. The experimental results show that the average accuracy of WTCA is higher than that of the other three algorithms.
  • Keywords
    Web sites; data mining; distance learning; learning (artificial intelligence); pattern clustering; text analysis; Chinese long-distance education network; DFSSM-based Web text clustering algorithm; SOM training stage; WTCA; Web pages; clustering stage; concept space; SOM; Web text mining; clustering analysis; richly structured datasets
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Network Technology (ICCSNT), 2012 2nd International Conference on
  • Conference_Location
    Changchun
  • Print_ISBN
    978-1-4673-2963-7
  • Type
    conf
  • DOI
    10.1109/ICCSNT.2012.6526031
  • Filename
    6526031