Title :
An Algorithm of Semi-supervised Web-Page Classification Based on Fuzzy Clustering
Author :
Geng, Chen ; Yuquan, Zhu ; Jianing, Tan ; Tianhan, Hu
Author_Institution :
Jiangsu Key Lab. of Audit Inf. Eng., Nanjing Audit Univ., Nanjing, China
Abstract :
Labeled training samples are very difficult to obtain, whereas unlabeled samples are very easy to obtain. How to classify Web pages using these training samples is therefore an important task. An algorithm called FC-TSVM, based on fuzzy clustering, is proposed. FC-TSVM uses a fuzzy clustering algorithm to determine the number of positive-label samples and incorporates homepage hyperlink information into the classification. Experiments show that FC-TSVM can effectively improve the accuracy and stability of Web-page classification.
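The abstract does not say how fuzzy clustering yields the number of positive-label samples, so the following is only a minimal sketch under stated assumptions, not the authors' FC-TSVM procedure. It assumes plain fuzzy c-means with two clusters over the unlabeled feature vectors, picks the cluster whose center lies nearest the mean of the labeled positive samples, and counts unlabeled samples whose membership in that cluster exceeds 0.5. The names fuzzy_c_means and estimate_positive_count, the 0.5 threshold, and the toy feature vectors are all hypothetical.

```python
import math
import random


def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means. Returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are means weighted by membership raised to the fuzzifier m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Memberships are proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [max(math.dist(points[i], centers[j]), 1e-12)
                    for j in range(c)]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift,
                        max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u


def estimate_positive_count(unlabeled, labeled_positive, seed=0):
    """Assumed heuristic: count unlabeled samples in the 'positive' cluster."""
    centers, u = fuzzy_c_means(unlabeled, c=2, seed=seed)
    dim = len(unlabeled[0])
    pos_mean = [sum(p[d] for p in labeled_positive) / len(labeled_positive)
                for d in range(dim)]
    # The positive cluster is the one nearest the labeled positive mean.
    pos_cluster = min(range(2), key=lambda j: math.dist(centers[j], pos_mean))
    return sum(1 for row in u if row[pos_cluster] > 0.5)


# Toy feature vectors (hypothetical): three pages near the origin,
# five pages near (5, 5), and one labeled positive page near the origin.
unlabeled = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
             (5.0, 5.1), (5.2, 4.9), (4.9, 5.0), (5.1, 5.2), (5.0, 4.8)]
labeled_positive = [(0.1, 0.0)]
print(estimate_positive_count(unlabeled, labeled_positive))
```

In a transductive SVM, a count like this would typically stand in for the user-supplied number of positive samples among the unlabeled data. How FC-TSVM actually feeds the estimate into training, and how it uses hyperlink information, is not described in this record.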
Keywords :
Web sites; fuzzy set theory; pattern clustering; FC-TSVM; fuzzy clustering; nonlabeled training samples; semisupervised Web page classification; Application software; Classification algorithms; Clustering algorithms; Computer science; Information technology; Internet; Laboratories; Semisupervised learning; Support vector machine classification; Support vector machines; semi-supervised learning; transductive Support Vector Machines; web-page classification
Conference_Title :
Information Technology and Applications, 2009. IFITA '09. International Forum on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3600-2
DOI :
10.1109/IFITA.2009.490