DocumentCode
890884
Title
Olex: Effective Rule Learning for Text Categorization
Author
Rullo, Pasquale ; Policicchio, Veronica Lucia ; Cumbo, Chiara ; Iiritano, Salvatore
Author_Institution
Dept. of Math., Univ. of Calabria, Rende
Volume
21
Issue
8
fYear
2009
Firstpage
1118
Lastpage
1132
Abstract
This paper describes Olex, a novel method for the automatic induction of rule-based text classifiers. Olex supports a hypothesis language of the form "if T1 or ... or Tn occurs in document d, and none of Tn+1, ..., Tn+m occurs in d, then classify d under category c," where each Ti is a conjunction of terms. The proposed method is simple and elegant. Despite this, the results of a systematic experimentation performed on the REUTERS-21578, the OHSUMED, and the ODP data collections show that Olex provides classifiers that are accurate, compact, and comprehensible. A comparative analysis conducted against some of the most well-known learning algorithms (namely, Naive Bayes, Ripper, C4.5, SVM, and Linear Logistic Regression) demonstrates that it is more than competitive in terms of both predictive accuracy and efficiency.
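The hypothesis language quoted in the abstract can be illustrated with a minimal sketch. This is not the Olex learning algorithm, which the abstract does not detail; it only shows how a rule of that form would be evaluated once learned. The function name `classify`, the representation of a document as a set of terms, and the example terms are all assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, Set

# A conjunction of terms Ti is modelled as a frozenset: it "occurs" in a
# document when every one of its terms is present in the document.
Conjunction = FrozenSet[str]


def classify(doc_terms: Set[str],
             positive: Iterable[Conjunction],
             negative: Iterable[Conjunction]) -> bool:
    """Return True if the document falls under the category.

    The rule fires when at least one positive conjunction (T1..Tn) occurs
    in the document and none of the negative conjunctions (Tn+1..Tn+m) does.
    """
    has_positive = any(t <= doc_terms for t in positive)
    has_negative = any(t <= doc_terms for t in negative)
    return has_positive and not has_negative


# Hypothetical rule for a made-up category; the terms are illustrative only.
POSITIVE = [frozenset({"wheat"}), frozenset({"grain", "export"})]
NEGATIVE = [frozenset({"corn", "futures"})]

if __name__ == "__main__":
    doc = set("grain export volumes rose this quarter".split())
    print(classify(doc, POSITIVE, NEGATIVE))
```

Representing each conjunction as a set keeps the check to a subset test, which mirrors the "occurs in document d" wording of the rule form.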
Keywords
knowledge based systems; learning (artificial intelligence); pattern classification; text analysis; Olex; rule learning; rule-based text classifiers; text categorization; Clustering; Data mining; Mining methods and algorithms; Text mining; classification and association rules
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2008.206
Filename
4641927