• DocumentCode
    890884
  • Title
    Olex: Effective Rule Learning for Text Categorization
  • Author
    Rullo, Pasquale ; Policicchio, Veronica Lucia ; Cumbo, Chiara ; Iiritano, Salvatore
  • Author_Institution
    Dept. of Math., Univ. of Calabria, Rende
  • Volume
    21
  • Issue
    8
  • fYear
    2009
  • Firstpage
    1118
  • Lastpage
    1132
  • Abstract
    This paper describes Olex, a novel method for the automatic induction of rule-based text classifiers. Olex supports a hypothesis language of the form "if T1 or ... or Tn occurs in document d, and none of Tn+1, ..., Tn+m occurs in d, then classify d under category c," where each Ti is a conjunction of terms. The proposed method is simple and elegant. Despite this, the results of a systematic experimentation performed on the REUTERS-21578, the OHSUMED, and the ODP data collections show that Olex provides classifiers that are accurate, compact, and comprehensible. A comparative analysis conducted against some of the most well-known learning algorithms (namely, Naive Bayes, Ripper, C4.5, SVM, and Linear Logistic Regression) demonstrates that it is more than competitive in terms of both predictive accuracy and efficiency.
  • Keywords
    knowledge based systems; learning (artificial intelligence); pattern classification; text analysis; Olex; rule learning; rule-based text classifiers; text categorization; Clustering, classification, and association rules; Data mining; Mining methods and algorithms; Text mining
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2008.206
  • Filename
    4641927