DocumentCode :
3028952
Title :
Web Document Classification Based on Fuzzy k-NN Algorithm
Author :
Zhang, Juan ; Niu, Yi ; Nie, Huabei
Author_Institution :
Comput. & Inf. Sci. Dept., Dongguan Univ. of Technol., Dongguan, China
Volume :
1
fYear :
2009
fDate :
11-14 Dec. 2009
Firstpage :
193
Lastpage :
196
Abstract :
Web document classification is an important Web mining technique, and Web page classification has been studied extensively since the Internet became a huge database of information. k-NN is a simple classification algorithm that assigns a pattern of unknown class to the majority class among its k nearest neighbors of known class, according to a distance measure. Its main drawback is that every pattern of known class is considered equally important when assigning the pattern to be classified. Fuzzy k-nearest neighbor (fuzzy k-NN) is an improved variant of k-NN that has been applied successfully to structural data classification. This paper presents Web document classification based on fuzzy k-NN. During classification, TF/IDF (term frequency/inverse document frequency) is adopted for selecting document features, and membership grades are used to increase accuracy and better suit real-world data. Experimental results show that the method achieves better classification performance than both k-NN and the support vector machine (SVM).
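The abstract does not give the paper's exact formulas, so the following is only a minimal sketch of the general idea, assuming the commonly used distance-weighted form of fuzzy k-NN with crisp training labels, cosine distance over TF/IDF vectors, and a fuzzifier `m`. All function names, the smoothing constant `eps`, and the toy documents are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def fit_idf(docs):
    """Compute a smoothed IDF weight for every term in the tokenized training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log(n / df[t]) + 1.0 for t in df}


def vectorize(doc, idf):
    """Sparse TF/IDF vector for one tokenized document (unknown terms are dropped)."""
    tf = Counter(doc)
    return {t: (c / len(doc)) * idf[t] for t, c in tf.items() if t in idf}


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def fuzzy_knn(query, train_vecs, train_labels, k=3, m=2.0, eps=1e-9):
    """Return a membership grade per class instead of a single majority vote.

    Assumption: each training document belongs fully to its labeled class,
    and closer neighbors receive weight 1 / d^(2 / (m - 1)).
    """
    nearest = sorted(
        (cosine_distance(query, v), y) for v, y in zip(train_vecs, train_labels)
    )[:k]
    weights = defaultdict(float)
    total = 0.0
    for d, y in nearest:
        w = 1.0 / (d ** (2.0 / (m - 1.0)) + eps)  # eps avoids division by zero
        weights[y] += w
        total += w
    return {y: w / total for y, w in weights.items()}


# Toy data, invented purely for illustration.
train_docs = [
    ["football", "match", "goal"],
    ["goal", "league", "team"],
    ["stock", "market", "shares"],
    ["market", "bank", "shares"],
]
train_labels = ["sport", "sport", "finance", "finance"]

idf = fit_idf(train_docs)
train_vecs = [vectorize(d, idf) for d in train_docs]
memberships = fuzzy_knn(vectorize(["team", "goal", "match"], idf), train_vecs, train_labels)
predicted = max(memberships, key=memberships.get)
```

Unlike plain k-NN, where each of the k neighbors casts an equal vote, the returned grades sum to 1 and reflect how close each neighbor is, so a distant neighbor contributes little to the final decision.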
Keywords :
Internet; classification; document handling; fuzzy set theory; pattern classification; Web document classification; Web mining; Web pages classification; data classification; fuzzy k-NN algorithm; fuzzy k-nearest neighbor; inverse document frequency; term frequency; Cities and towns; Databases; Educational institutions; Information science; Nearest neighbor searches; Support vector machine classification; Support vector machines; Web pages; TF/IDF; fuzzy k-NN
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Security, 2009. CIS '09. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-5411-2
Type :
conf
DOI :
10.1109/CIS.2009.28
Filename :
5376647