• DocumentCode
    3274232
  • Title

    An Algorithm of Semi-supervised Web-Page Classification Based on Fuzzy Clustering

  • Author

    Geng, Chen ; Yuquan, Zhu ; Jianing, Tan ; Tianhan, Hu

  • Author_Institution
    Jiangsu Key Lab. of Audit Inf. Eng., Nanjing Audit Univ., Nanjing, China
  • Volume
    1
  • fYear
    2009
  • fDate
    15-17 May 2009
  • Firstpage
    3
  • Lastpage
    7
  • Abstract
    Labeled training samples are very difficult to obtain, whereas unlabeled training samples are easy to obtain. How to classify Web pages using these training samples is therefore an important task. An algorithm called FC-TSVM, based on fuzzy clustering, is proposed. FC-TSVM uses a fuzzy clustering algorithm to determine the number of positive-label samples and adds homepage hyperlink information as part of the classification. Experiments show that FC-TSVM can effectively improve the accuracy and stability of Web page classification.
  • Keywords
    Web sites; fuzzy set theory; pattern clustering; FC-TSVM; fuzzy clustering; nonlabeled training samples; semisupervised Web page classification; Application software; Classification algorithms; Clustering algorithms; Computer science; Information technology; Internet; Laboratories; Semisupervised learning; Support vector machine classification; Support vector machines; semi-supervised learning; transductive Support Vector Machines; web-page classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology and Applications, 2009. IFITA '09. International Forum on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-0-7695-3600-2
  • Type

    conf

  • DOI
    10.1109/IFITA.2009.490
  • Filename
    5231499