Title of article :
Improving Performance of Text Categorization by
Combining Filtering and Support Vector Machines
Author/Authors :
Irene Díaz, José Ranilla, Elena Montañés, Javier Fernández, and Elías F. Combarro
Issue Information :
Monthly, with serial issue number, year 2004
Abstract :
Text Categorization is the process of assigning documents
to a set of previously fixed categories. Considerable
research aims to automate this time-consuming task.
Several different algorithms have been applied, and
Support Vector Machines (SVM) have shown very good
results. In this report, we test the hypothesis that
filtering the words used by SVM before classification
can improve the overall performance. The hypothesis is
systematically tested with three different measures of
word relevance, on two different corpora (one of them
considered in three different splits), and with both
local and global vocabularies. The results show that
filtering significantly improves the recall of the
method, which in turn significantly improves the
overall performance.
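
The filtering step described in the abstract can be pictured with a minimal sketch. The abstract does not name the three relevance measures, so the score used here (the absolute difference between a word's document frequency inside and outside a category) is purely an illustrative assumption, as are the function name `filter_vocabulary`, the `keep_fraction` parameter, and the toy documents. The surviving words would then be the only features passed to an SVM; the SVM itself is omitted.

```python
from collections import Counter


def filter_vocabulary(docs, labels, category, keep_fraction=0.5):
    """Keep the most relevant words for `category` (hypothetical helper).

    Relevance here is an assumed stand-in measure, not one taken from
    the paper: |P(word in doc | category) - P(word in doc | other)|.
    """
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for tokens, label in zip(docs, labels):
        if label == category:
            pos.update(set(tokens))  # count each word once per document
            n_pos += 1
        else:
            neg.update(set(tokens))
            n_neg += 1

    def score(word):
        p = pos[word] / n_pos if n_pos else 0.0
        q = neg[word] / n_neg if n_neg else 0.0
        return abs(p - q)

    vocab = set(pos) | set(neg)
    # Sort by descending relevance; break ties alphabetically for determinism.
    ranked = sorted(vocab, key=lambda w: (-score(w), w))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Toy corpus (invented for illustration only).
docs = [
    ["wheat", "price", "the"],
    ["wheat", "farm", "the"],
    ["oil", "price", "the"],
    ["oil", "barrel", "the"],
]
labels = ["grain", "grain", "crude", "crude"]
kept = filter_vocabulary(docs, labels, "grain", keep_fraction=0.5)
```

In this toy run, words that appear equally often in both categories, such as "the", score zero and are dropped, while words that separate the categories are kept.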
Journal title :
Journal of the American Society for Information Science and Technology