• DocumentCode
    3014613
  • Title
    Web document clustering approach using wordnet lexical categories and fuzzy clustering
  • Author
    Gharib, T.F. ; Fouad, Mohammed M. ; Aref, Mostafa M.
  • Author_Institution
    Fac. of Comput. & Inf. Sci., Ain Shams Univ., Cairo
  • fYear
    2008
  • fDate
    24-27 Dec. 2008
  • Firstpage
    48
  • Lastpage
    55
  • Abstract
    Web mining is defined as applying data mining techniques to the content, structure, and usage of Web resources. Three areas of Web mining are commonly distinguished: content mining, structure mining, and usage mining. In all these areas, a wide range of general data mining techniques, in particular association rule discovery, clustering, classification, and sequence mining, are employed and developed further to reflect the specific structures of Web resources and the specific questions posed in Web mining. In this paper, we introduce a Web document clustering approach that uses WordNet lexical categories and the fuzzy c-means algorithm to improve clustering performance for Web documents. Experiments show that the fuzzy c-means algorithm achieves a substantial performance improvement compared with recent document clustering algorithms.
  • Keywords
    Internet; content management; data mining; document handling; fuzzy set theory; pattern clustering; Web document clustering; Web mining; Web resource; WordNet lexical category; association rule discovery; content mining; fuzzy c-means algorithm; fuzzy clustering; sequence mining; structure mining; usage mining; Artificial intelligence; Association rules; Clustering algorithms; Clustering methods; Computer science; Information science; Optimization; Text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Technology, 2008. ICCIT 2008. 11th International Conference on
  • Conference_Location
    Khulna
  • Print_ISBN
    978-1-4244-2135-0
  • Electronic_ISBN
    978-1-4244-2136-7
  • Type
    conf
  • DOI
    10.1109/ICCITECHN.2008.4803109
  • Filename
    4803109