Title :
Web news classification using neural networks based on PCA
Author :
Selamat, Ali ; Yanagimoto, Hidekazu ; Omatu, Sigeru
Author_Institution :
Eng. Dept., Osaka Prefecture Univ., Sakai, Japan
Abstract :
In this paper, we propose a news web page classification method (WPCM). The WPCM uses a neural network whose inputs combine principal components with class profile-based features (CPBF). A fixed number of regular words from each class is used as a feature vector, together with the reduced features from the PCA. These feature vectors are then used as the input to the neural network for classification. The experimental evaluation demonstrates that the WPCM provides acceptable classification accuracy on the sports news datasets.
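The feature construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy term-frequency matrix, and the reading of "regular words" as the most frequent terms per class are all assumptions made for illustration. The sketch only builds the combined input vectors; the neural network that would consume them is omitted.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components (computed via SVD)."""
    Xc = X - X.mean(axis=0)  # center each term column
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def class_profile_features(X, y, words_per_class):
    """Keep the columns of the most frequent terms in each class.

    Assumption: a simple stand-in for CPBF, not the paper's exact definition.
    """
    cols = []
    for c in np.unique(y):
        totals = X[y == c].sum(axis=0)  # term totals within class c
        top = np.argsort(-totals, kind="stable")[:words_per_class]
        cols.extend(int(i) for i in top)
    cols = sorted(set(cols))
    return X[:, cols], cols

def build_features(X, y, n_components, words_per_class):
    """Concatenate PCA-reduced features with class-profile features."""
    pcs = pca_reduce(X, n_components)
    cpbf, cols = class_profile_features(X, y, words_per_class)
    return np.hstack([pcs, cpbf]), cols

# Toy term-frequency matrix (hypothetical data): 4 documents x 5 terms, 2 classes.
X = np.array([
    [3, 0, 1, 0, 0],
    [2, 1, 0, 0, 0],
    [0, 0, 0, 4, 1],
    [0, 1, 0, 3, 2],
], dtype=float)
y = np.array([0, 0, 1, 1])

features, selected = build_features(X, y, n_components=2, words_per_class=1)
print(features.shape)  # (4, 4): 2 principal components + 2 class-profile terms
print(selected)        # [0, 3]: the top term of each class
```

Each row of `features` would then serve as one input vector to the classifier. At prediction time, the same centering, projection, and selected term columns learned from the training set would be applied to new documents.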
Keywords :
Internet; classification; neural nets; principal component analysis; CPBF; PCA; WPCM; Web news classification; class profile-based features; neural networks; news Web page classification method; principal components; Frequency; Indexing; Information retrieval; Large scale integration; Neural networks; Principal component analysis; Systems engineering and theory; Text categorization; Web pages; World Wide Web
Conference_Title :
SICE 2002. Proceedings of the 41st SICE Annual Conference
Print_ISBN :
0-7803-7631-5
DOI :
10.1109/SICE.2002.1195784