  • DocumentCode
    2131215
  • Title
    An Adaptive Pre-filtering Technique for Error-Reduction Sampling in Active Learning
  • Author
    Davy, Michael ; Luz, Saturnino
  • Author_Institution
    Artificial Intell. Group, Trinity Coll. Dublin, Dublin
  • fYear
    2008
  • fDate
    15-19 Dec. 2008
  • Firstpage
    682
  • Lastpage
    691
  • Abstract
    Error-reduction sampling (ERS) is a high-performing but computationally expensive query selection strategy for active learning. Subset optimisation has been proposed to reduce this expense by applying ERS to only a subset of examples from the pool. This paper compares two techniques for constructing the subset: random sub-sampling and pre-filtering. We focus on pre-filtering, which populates the subset with more informative examples by filtering the unlabelled pool using a query selection strategy. We establish whether pre-filtering outperforms sub-sampling optimisation, examine the effect of subset size, and propose a novel adaptive pre-filtering technique that dynamically switches between several alternative pre-filtering techniques using a multi-armed bandit algorithm. Empirical evaluations conducted on two benchmark text categorisation datasets demonstrate that pre-filtered ERS achieves higher levels of accuracy than sub-sampled ERS. The proposed adaptive pre-filtering technique is also shown to be competitive with the optimal pre-filtering technique on the majority of problems and is never the worst technique.
  • Keywords
    adaptive filters; filtering theory; learning (artificial intelligence); query processing; set theory; signal sampling; text analysis; active learning; adaptive prefiltering technique; benchmark text categorisation datasets; empirical evaluations; error-reduction sampling; query selection strategy; subset optimisation; Artificial intelligence; Computer errors; Computer science; Conferences; Data mining; Educational institutions; Learning; Sampling methods; Switches; Text categorization; Active Learning; Error Reduction Sampling; Text Categorisation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
  • Conference_Location
    Pisa
  • Print_ISBN
    978-0-7695-3503-6
  • Electronic_ISBN
    978-0-7695-3503-6
  • Type
    conf
  • DOI
    10.1109/ICDMW.2008.52
  • Filename
    4733994
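
The adaptive pre-filtering idea summarised in the abstract can be illustrated with a minimal sketch. The abstract does not say which multi-armed bandit algorithm, reward signal, or pre-filtering strategies the authors use, so everything below is an assumption made for illustration: an epsilon-greedy bandit stands in for the bandit, `filters` maps hypothetical strategy names to scoring functions, and `expected_error` is a hypothetical stand-in for the ERS estimate of future error after labelling an example.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type: maps a pool index to an informativeness score.
Scorer = Callable[[int], float]


def prefilter(pool: Sequence[int], scorer: Scorer, k: int) -> List[int]:
    """Keep the k pool items that the given scorer ranks highest."""
    return sorted(pool, key=scorer, reverse=True)[:k]


class EpsilonGreedyBandit:
    """Picks one arm per round. Epsilon-greedy is an assumption here,
    not the algorithm stated in the record."""

    def __init__(self, arms: Sequence[str], epsilon: float = 0.1, seed: int = 0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts: Dict[str, int] = {a: 0 for a in self.arms}
        self.means: Dict[str, float] = {a: 0.0 for a in self.arms}
        self.rng = random.Random(seed)

    def select(self) -> str:
        # Play every arm once before exploiting the running means.
        for arm in self.arms:
            if self.counts[arm] == 0:
                return arm
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda a: self.means[a])

    def update(self, arm: str, reward: float) -> None:
        # Incremental update of the arm's mean reward.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def select_query(
    pool: Sequence[int],
    filters: Dict[str, Scorer],
    bandit: EpsilonGreedyBandit,
    expected_error: Callable[[int], float],
    k: int,
) -> Tuple[str, int]:
    """One active-learning round: the bandit picks a pre-filter, the
    pre-filter builds a size-k subset, and the ERS step (mocked by
    expected_error) picks the subset item with the lowest estimate."""
    arm = bandit.select()
    subset = prefilter(pool, filters[arm], k)
    query = min(subset, key=expected_error)
    return arm, query
```

After each round the caller would label the returned example, retrain the classifier, and pass some measure of improvement to `bandit.update(arm, reward)`. The choice of reward is left open here because the record does not specify it.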