Title :
Selective sampling based on the variation in label assignments
Author :
Juszczak, Piotr ; Duin, Robert P W
Author_Institution :
Fac. of Electr. Eng., Math. & Comput. Sci., Delft Univ. of Technol., Netherlands
Abstract :
In this paper, a new selective sampling method for the active learning framework is presented. Initially, a small training set T and a large unlabeled set Ω are given. The goal is to select, one by one, the most informative objects from Ω such that, after labeling by an expert, they guarantee the best improvement in the classifier performance. Our sampling strategy relies on measuring the variation in label assignments (of the unlabeled set) between the classifier trained on T and the classifiers trained on T with a single unlabeled object added with all possible labels. We compare the performance of our algorithm with two traditional procedures: random sampling and uncertainty sampling. We show empirically, across a range of datasets, that the proposed selective sampling method decreases the number of labeled instances needed to achieve the desired error for a fixed size of T. Experimental results on toy problems and the UCI datasets are presented.
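The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-mean classifier, the function names, and the choice to sum the number of changed labels over all candidate labels are assumptions made for illustration only, since the abstract does not specify the classifier or the exact aggregation.

```python
from statistics import fmean


def fit_nearest_mean(X, y):
    """Fit a nearest-mean classifier (assumed stand-in model): {label: class mean}."""
    means = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [fmean(col) for col in zip(*rows)]
    return means


def predict(means, X):
    """Assign each object in X the label of the closest class mean."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(means, key=lambda lab: sqdist(x, means[lab])) for x in X]


def label_variation_scores(T_X, T_y, U):
    """Score each unlabeled object by how much adding it to T changes labels on U.

    For every candidate and every possible label, retrain on T plus that
    (candidate, label) pair and count the objects in U whose assigned label
    differs from the assignment of the classifier trained on T alone.
    Summing the counts over labels is an assumption of this sketch.
    """
    labels = sorted(set(T_y))
    base = predict(fit_nearest_mean(T_X, T_y), U)
    scores = []
    for candidate in U:
        changed = 0
        for label in labels:
            model = fit_nearest_mean(T_X + [candidate], T_y + [label])
            new = predict(model, U)
            changed += sum(a != b for a, b in zip(base, new))
        scores.append(changed)
    return scores


def select_most_informative(T_X, T_y, U):
    """Return the index in U of the object with the largest label variation."""
    scores = label_variation_scores(T_X, T_y, U)
    return max(range(len(U)), key=scores.__getitem__)
```

In an active learning loop, the selected object would be labeled by the expert, moved from Ω to T, and the scores recomputed before the next query. Under these assumptions, objects near the current decision boundary tend to receive higher scores, because adding them with either label shifts the boundary enough to relabel neighbouring objects.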
Keywords :
learning (artificial intelligence); pattern classification; random processes; sampling methods; UCI datasets; active learning model; label assignment variation; pattern classifier; random sampling; selective sampling method; uncertainty sampling; Computer science; Error analysis; Labeling; Machine learning; Mathematics; Sampling methods; Semisupervised learning; Statistics; Testing; Uncertainty;
Conference_Title :
Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on
Print_ISBN :
0-7695-2128-2
DOI :
10.1109/ICPR.2004.1334545