Title :
A Classification Approach of News Web Pages from Multi-Media Sources at Chinese Entry Website-Taiwan Yahoo! as an Example
Author :
Chiu, Deng-Yiv ; Lee, Chi-Chung ; Pan, Ya-Chen
Author_Institution :
Dept. of Inf. Manage., Chung Hua Univ., Hsinchu, Taiwan
Abstract :
Chinese Web portals contain many news items that are obviously classified into incorrect categories. For example, the news item dated Aug 13, 2008, titled "A 78-year-old man completed his Bachelor's degree", is incorrectly classified under politics on the Taiwan Yahoo Web site. This phenomenon is mainly due to the difficulty of automatically classifying Chinese news and to the fact that the news appearing on Web portals is retrieved from numerous media sources that use differing category schemes. We use a genetic algorithm to select four feature thresholds, which are used to obtain representative features of each class and to construct the vector space model for each document. A multi-class SVM classifier is then trained to automatically detect classification errors in Chinese news.
Keywords :
Web sites; genetic algorithms; pattern classification; portals; support vector machines; Chinese Web pages portals; Chinese entry Web site; Taiwan Yahoo; automatic classification error detection; feature threshold selection; genetic algorithm; multiclass support vector machine classifier; multimedia sources; news Web pages classification approach; Bayesian methods; Electronic mail; Fuzzy set theory; Genetic algorithms; Information management; Multimedia computing; Portals; Support vector machine classification; Support vector machines; Web pages;
Conference_Title :
Innovative Computing, Information and Control (ICICIC), 2009 Fourth International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-1-4244-5543-0
DOI :
10.1109/ICICIC.2009.3