DocumentCode :
482165
Title :
Preprocessing and Feature Preparation in Chinese Web Page Classification
Author :
Huang, Weitong ; Xu, Luxiong ; Liu, Yanmin
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
Volume :
1
fYear :
2009
fDate :
22-24 Jan. 2009
Firstpage :
64
Lastpage :
67
Abstract :
This paper describes the detailed design and implementation of a Chinese Web-page classification system and proposes several methods for Chinese Web-page preprocessing and feature preparation. Experimental results on a Chinese Web-page dataset show that the proposed methods improve performance from 75.82% to 81.88%, a gain of 6.06 percentage points.
Keywords :
Web sites; classification; natural language processing; Chinese Web page classification; Chinese Web-page dataset; Chinese Web-page preprocessing; feature preparation; Application software; Computer applications; Computer science; Data mining; Design engineering; HTML; Navigation; Particle separators; Vocabulary; Web pages; Text classification
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Engineering and Technology, 2009. ICCET '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-3334-6
Type :
conf
DOI :
10.1109/ICCET.2009.72
Filename :
4769428