Title :
Classification with reject option in text categorisation systems
Author :
Fumera, Giorgio ; Pillai, Ignazio ; Roli, Fabio
Author_Institution :
Dept. of Electr. & Electron. Eng., Cagliari, Italy
Abstract :
The aim of this paper is to evaluate the potential usefulness of the reject option for text categorisation (TC) tasks. The reject option is a technique used in statistical pattern recognition for improving classification reliability. Our work is motivated by the fact that, although the reject option proved to be useful in several pattern recognition problems, it has not yet been considered for TC tasks. Since TC tasks differ from usual pattern recognition problems in the performance measures used and in the fact that documents can belong to more than one category, we developed a specific rejection technique for TC problems. The performance improvement achievable by using the reject option was experimentally evaluated on the Reuters dataset, which is a standard benchmark for TC systems.
Keywords :
information retrieval; pattern classification; statistical analysis; text analysis; Reuters dataset benchmark; classification reliability; documents; performance measures; reject option; statistical pattern recognition; text categorisation systems; Indexing; Information filtering; Information filters; Information retrieval; Knowledge engineering; Learning systems; Machine learning; Pattern recognition; Text categorization; Web pages
Conference_Title :
Image Analysis and Processing, 2003. Proceedings. 12th International Conference on
Print_ISBN :
0-7695-1948-2
DOI :
10.1109/ICIAP.2003.1234113