Title :
A Novel Approach to Naive Bayes Web Page Automatic Classification
Author :
He, Zhongli ; Liu, Zhijing
Author_Institution :
Sch. of Comput. Sci. & Technol., Xidian Univ., Xian
Abstract :
This paper proposes a novel approach to Web page classification that uses a Naive Bayes (NB) classifier based on independent component analysis (ICA). To perform the classification, a Web page is first represented as a vector of weighted features, using an improved weight calculation method. Because the number of features is large, principal component analysis (PCA) is applied in the preprocessing stage to select the relevant features, and its output serves as input to an improved ICA algorithm (MFICA). Finally, the output of MFICA is passed to the NB classifier to boost the classifier's performance. The experimental evaluation demonstrates that the NB classifier based on the ICA model provides acceptable classification accuracy.
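The stages named in the abstract (weighted feature vector, PCA, ICA, NB) can be sketched as a minimal pipeline. This is only an illustration under stated assumptions: it uses scikit-learn, substitutes standard TF-IDF for the paper's improved weighting method and standard FastICA for MFICA (neither is specified in this record), and runs on a made-up toy corpus. The page texts, labels, and component counts are hypothetical.

```python
from sklearn.decomposition import PCA, FastICA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# Hypothetical toy corpus standing in for Web page text; not data from the paper.
pages = [
    "football match score goal league team",
    "basketball team wins league final game",
    "tennis player wins match in final set",
    "team coach praises players after game",
    "cheap flights hotel booking travel deals",
    "beach resort hotel holiday travel guide",
    "city tour travel tips and hotel reviews",
    "book flights early for holiday deals",
]
labels = ["sports"] * 4 + ["travel"] * 4


def to_dense(matrix):
    """Convert the sparse TF-IDF matrix to a dense array, which PCA requires."""
    return matrix.toarray()


model = make_pipeline(
    # Step 1: weighted feature vector (plain TF-IDF, an assumed stand-in
    # for the paper's improved weighting method).
    TfidfVectorizer(),
    FunctionTransformer(to_dense, accept_sparse=True),
    # Step 2: PCA reduces the large feature space before ICA.
    PCA(n_components=4, random_state=0),
    # Step 3: standard FastICA (an assumed stand-in for MFICA).
    FastICA(n_components=4, random_state=0, max_iter=1000),
    # Step 4: Naive Bayes on the independent components; the Gaussian
    # variant is used because the components are real-valued.
    GaussianNB(),
)

model.fit(pages, labels)
train_predictions = model.predict(pages)
predicted = model.predict(["hotel and flights deals for a holiday"])
```

GaussianNB is chosen here because ICA output is continuous and can be negative, which rules out the multinomial variant. Whether the paper uses the same NB formulation is not stated in this record.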
Keywords :
Bayes methods; Web sites; independent component analysis; information analysis; principal component analysis; Naive Bayes classifier; Web page automatic classification; Data mining; Fuzzy systems; Information retrieval; Internet; Neural networks; Niobium; Signal processing algorithms; Web pages;
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Shandong
Print_ISBN :
978-0-7695-3305-6
DOI :
10.1109/FSKD.2008.284