Title :
Normalized and classified feature selection in text categorization
Author :
Wang, Xiujuan ; Guo, Jun ; Zheng, Kangfeng
Author_Institution :
Sch. of Inf. & Eng., Beijing Univ. of Posts & Telecommun., China
Abstract :
Feature selection is an effective way to reduce the dimensionality of text vectors in automatic text categorization systems. Based on experimental data, the paper identifies a defect shared by several common evaluation functions and proposes that normalization be incorporated into these methods as a necessary step. The paper also introduces a new idea, classified feature selection, which applies a traditional evaluation function within each class. Experiments confirm the validity of both solutions.
Keywords :
pattern classification; text analysis; classified feature selection; evaluation functions; normalized feature selection; text categorization; text vector; Automatic testing; Dictionaries; Entropy; Frequency; Internet; Mutual information; Text categorization
Conference_Title :
Communications and Information Technology, 2005. ISCIT 2005. IEEE International Symposium on
Print_ISBN :
0-7803-9538-7
DOI :
10.1109/ISCIT.2005.1566826