DocumentCode :
401805
Title :
SVM based Chinese Web page automatic classification
Author :
Liang, Jiu-Zhen
Author_Institution :
Inst. of Comput. Sci., Zhejiang Normal Univ., Jinhua, China
Volume :
4
fYear :
2003
fDate :
2-5 Nov. 2003
Firstpage :
2265
Abstract :
This paper addresses Chinese web page classification based on the support vector machine (SVM). First, several methods are proposed for feature extraction and selection based on textual keywords. Then, specific problems in statistical learning theory, support vector machines, and their application to classification are discussed. A quadratic programming algorithm for constructing the SVM classifier is also described. In the experimental part, a sample set of 5096 samples is chosen from the web version of the Chinese People's Daily. It is split into two sets: a training set of 3398 samples and a test set of 1698 samples. Two kinds of kernel function, polynomial and radial basis function, are considered in constructing the SVM classifier. The final classification accuracies are 89.81% and 86.51% for the two classifiers, respectively.
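As an illustration of the two kernel families named in the abstract, the following is a minimal sketch of their textbook forms. The abstract does not give the kernel parameters, so `degree`, `coef0`, and `gamma` are hypothetical defaults, and the function names are chosen here for illustration only. The last lines only check the sample counts stated in the abstract.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Textbook polynomial kernel: (x . y + coef0) ** degree.

    degree and coef0 are assumed values, not taken from the paper.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Textbook radial basis function kernel: exp(-gamma * ||x - y||^2).

    gamma is an assumed value, not taken from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Sample counts as stated in the abstract.
total_samples, train_samples, test_samples = 5096, 3398, 1698
assert train_samples + test_samples == total_samples
train_fraction = train_samples / total_samples  # roughly two thirds
```

Both functions take two equal-length feature vectors, for example keyword weights for two pages, and return a similarity score that an SVM solver could use in place of a plain dot product.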
Keywords :
Web sites; feature extraction; learning (artificial intelligence); polynomials; quadratic programming; radial basis function networks; statistical analysis; support vector machines; text analysis; Chinese Web page; SVM; automatic classification; feature extraction; kernel function; polynomial function; quadratic program algorithm; radial basis function; statistic learning theory; support vector machine; textual keyword; Classification algorithms; Feature extraction; Kernel; Machine learning; Polynomials; Statistics; Support vector machine classification; Support vector machines; Testing; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
Type :
conf
DOI :
10.1109/ICMLC.2003.1259884
Filename :
1259884