Title :
Applying cascaded feature selection to SVM text categorization
Author :
Masuyama, Takeshi ; Nakagawa, Hiroshi
Author_Institution :
Inf. Technol. Center, Tokyo Univ., Japan
Abstract :
This paper investigates the effect of cascaded feature selection (CFS) on SVM text categorization. Unlike existing feature selection methods, CFS has two advantages. First, it can make use of the characteristics of each feature (word). Second, test documents that are unnecessary for a category, that is, documents that should be categorized into the negative set, can be removed in the first step. Compared with a method that does not apply CFS, our method achieved significantly better performance, especially for categories that contain a small number of training documents.
Keywords :
data mining; feature extraction; learning (artificial intelligence); text analysis; SVM text categorization; cascaded feature selection; test documents; training documents; Humans; Information technology; Organizing; Quality management; Search engines; Support vector machine classification; Support vector machines; Testing; Text categorization; Web sites
Conference_Title :
Database and Expert Systems Applications, 2002. Proceedings. 13th International Workshop on
Print_ISBN :
0-7695-1668-8
DOI :
10.1109/DEXA.2002.1045905