Title : 
An Adaptive Pre-filtering Technique for Error-Reduction Sampling in Active Learning
         
        
            Author : 
Davy, Michael ; Luz, Saturnino
         
        
            Author_Institution : 
Artificial Intelligence Group, Trinity College Dublin, Dublin
         
        
        
        
        
        
            Abstract : 
Error-reduction sampling (ERS) is a high-performing but computationally expensive query selection strategy for active learning. Subset optimisation has been proposed to reduce this expense by applying ERS to only a subset of the examples in the pool. This paper compares two techniques for constructing the subset: random sub-sampling and pre-filtering. We focus on pre-filtering, which populates the subset with more informative examples by filtering the unlabelled pool using a query selection strategy. We establish whether pre-filtering outperforms sub-sampling, examine the effect of subset size, and propose a novel adaptive pre-filtering technique that dynamically switches between several alternative pre-filtering techniques using a multi-armed bandit algorithm. Empirical evaluations on two benchmark text categorisation datasets demonstrate that pre-filtered ERS achieves higher accuracy than sub-sampled ERS. The proposed adaptive pre-filtering technique is also shown to be competitive with the optimal pre-filtering technique on the majority of problems and is never the worst technique.
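
The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which multi-armed bandit algorithm or reward signal is used, so the epsilon-greedy rule, the reward passed to `update`, and every name here (`EpsilonGreedyBandit`, `prefilter`, `select_query`, `expected_error`) are assumptions made for illustration. The ERS criterion itself is stubbed as a caller-supplied `expected_error` function.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over named arms (an assumed choice of algorithm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}

    def select(self):
        # Try every arm once before exploiting the running estimates.
        untried = [a for a in self.arms if self.counts[a] == 0]
        if untried:
            return untried[0]
        # Explore with probability epsilon, otherwise pick the best arm so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental mean of the rewards observed for this arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def prefilter(pool, score, k):
    """Keep the k highest-scoring examples from the unlabelled pool."""
    return sorted(pool, key=score, reverse=True)[:k]


def select_query(pool, prefilters, bandit, expected_error, k):
    """Choose a pre-filter with the bandit, then apply the expensive
    ERS criterion (stubbed as expected_error) to the filtered subset only."""
    arm = bandit.select()
    subset = prefilter(pool, prefilters[arm], k)
    query = min(subset, key=expected_error)
    return arm, query
```

After each query is labelled, the caller would pass some measure of how useful the query was to `bandit.update(arm, reward)`; what that measure should be is left open here. The point of the structure is that `expected_error` is evaluated on `k` examples rather than on the whole pool.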
         
        
            Keywords : 
adaptive filters; filtering theory; learning (artificial intelligence); query processing; set theory; signal sampling; text analysis; active learning; adaptive prefiltering technique; benchmark text categorisation datasets; empirical evaluations; error-reduction sampling; query selection strategy; subset optimisation; Artificial intelligence; Computer errors; Computer science; Conferences; Data mining; Educational institutions; Learning; Sampling methods; Switches; Text categorization; Active Learning; Error Reduction Sampling; Text Categorisation;
         
        
        
        
            Conference_Title : 
Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
         
        
            Conference_Location : 
Pisa
         
        
            Print_ISBN : 
978-0-7695-3503-6
         
        
            Electronic_ISBN : 
978-0-7695-3503-6
         
        
        
            DOI : 
10.1109/ICDMW.2008.52