DocumentCode :
3274232
Title :
An Algorithm of Semi-supervised Web-Page Classification Based on Fuzzy Clustering
Author :
Geng, Chen ; Yuquan, Zhu ; Jianing, Tan ; Tianhan, Hu
Author_Institution :
Jiangsu Key Lab. of Audit Inf. Eng., Nanjing Audit Univ., Nanjing, China
Volume :
1
fYear :
2009
fDate :
15-17 May 2009
Firstpage :
3
Lastpage :
7
Abstract :
Labeled training samples are very difficult to obtain, whereas unlabeled training samples are very easy to obtain. An important task, therefore, is how to classify Web pages using these training samples. An algorithm called FC-TSVM, based on fuzzy clustering, is proposed. FC-TSVM uses a fuzzy clustering algorithm to determine the number of positive-label samples and adds homepage hyperlink information as part of the classification. The experiments show that FC-TSVM can efficiently improve the accuracy and stability of Web page classification.
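The abstract does not give the details of FC-TSVM, so the following is only a minimal sketch of the general idea it names: using fuzzy clustering to estimate how many unlabeled samples are positive, a quantity a transductive SVM normally needs as input. It assumes plain fuzzy c-means with two clusters, a deterministic farthest-point initialization, and a 0.5 membership threshold. The function names `fuzzy_c_means` and `estimate_positive_count`, and the rule of picking the cluster closest to the labeled positives, are hypothetical and not taken from the paper.

```python
import math


def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-6):
    """Generic fuzzy c-means; returns (centers, memberships)."""
    n, dim = len(points), len(points[0])
    # Assumption: deterministic farthest-point initialization of the centers.
    centers = [list(points[0])]
    while len(centers) < c:
        far = max(points, key=lambda p: min(math.dist(p, ctr) for ctr in centers))
        centers.append(list(far))
    u = [[0.0] * c for _ in range(n)]
    for _ in range(iters):
        shift = 0.0
        # Membership update: inverse-distance weighting with fuzzifier m.
        for i, p in enumerate(points):
            dist = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            row = [
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for j in range(c)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(row, u[i])))
            u[i] = row
        # Center update: membership-weighted mean of all points.
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers[j] = [
                sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)
            ]
        if shift < tol:
            break
    return centers, u


def estimate_positive_count(unlabeled, labeled_positive):
    """Count unlabeled points that mostly belong to the 'positive' cluster."""
    centers, u = fuzzy_c_means(unlabeled, c=2)
    dim = len(unlabeled[0])
    pos_mean = [
        sum(p[d] for p in labeled_positive) / len(labeled_positive) for d in range(dim)
    ]
    # Assumption: the positive cluster is the one nearest the labeled positives.
    pos_cluster = min(range(2), key=lambda j: math.dist(centers[j], pos_mean))
    return sum(1 for row in u if row[pos_cluster] > 0.5)
```

In such a setup the returned count could be passed to a transductive SVM as its expected number of positive unlabeled samples, in place of a hand-set guess. How FC-TSVM actually incorporates it, and how it uses the hyperlink information, is not stated in this record.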
Keywords :
Web sites; fuzzy set theory; pattern clustering; FC-TSVM; fuzzy clustering; nonlabeled training samples; semisupervised Web page classification; Application software; Classification algorithms; Clustering algorithms; Computer science; Information technology; Internet; Laboratories; Semisupervised learning; Support vector machine classification; Support vector machines; semi-supervised learning; transductive Support Vector Machines; web-page classification
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology and Applications, 2009. IFITA '09. International Forum on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3600-2
Type :
conf
DOI :
10.1109/IFITA.2009.490
Filename :
5231499