Title :
Study of active learning in the challenge
Author :
Chen, Yukun ; Mani, Shoma
Author_Institution :
Dept. of Biomed. Inf., Vanderbilt Univ., Nashville, TN, USA
Abstract :
In the active learning challenge, we aim to improve the area under the learning curve (ALC), the challenge's global score, by optimizing the classification and feature selection methods and, most importantly, by refining the querying algorithm so that it selects the most informative instances in the early iterations of active learning. For six different datasets in the development phase, we applied general and specific methods to address class imbalance, sparse data, and missing values. We designed a voting system based on multiple models to combine good prediction with robust performance across different types of datasets. For querying, we modified the information density approach, first to avoid exhaustive comparison across all samples and second to find more representative samples. We also propose two modified versions of uncertainty sampling: uncertainty sampling with bias, which takes into account the high class imbalance of the data, and uncertainty sampling with prediction, which predicts the most uncertain samples from the change in uncertainty values during the active learning process. We present our preliminary results on the development datasets of the active learning challenge and discuss their significance.
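For orientation, the plain uncertainty sampling baseline that the two proposed variants build on can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes a binary classifier that outputs a positive-class probability for each sample, scores uncertainty with binary entropy, and leaves out both the bias and the prediction modifications, whose details the abstract does not give. The function names uncertainty_scores and select_queries are hypothetical.

```python
import numpy as np


def uncertainty_scores(probs):
    """Binary entropy (in bits) of predicted positive-class probabilities."""
    # Clip to avoid log(0) for fully confident predictions.
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))


def select_queries(probs, labeled_mask, batch_size):
    """Return indices of the most uncertain unlabeled samples.

    Assumption: labeled_mask[i] is True when sample i already has a label
    and must not be queried again.
    """
    labeled = np.asarray(labeled_mask, dtype=bool)
    scores = uncertainty_scores(probs)
    # Already-labeled samples get the lowest possible score.
    scores = np.where(labeled, -np.inf, scores)
    # Never return more indices than there are unlabeled samples.
    k = min(batch_size, int((~labeled).sum()))
    order = np.argsort(-scores, kind="stable")
    return order[:k]


if __name__ == "__main__":
    probs = [0.9, 0.5, 0.2, 0.45, 0.99]
    labeled = [False, True, False, False, False]
    print(select_queries(probs, labeled, batch_size=2))
```

In an active learning loop, a call such as select_queries(probs, labeled, batch_size) would run once per iteration, after the classifier has been retrained on the labels gathered so far.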
Keywords :
feature extraction; iterative methods; learning (artificial intelligence); pattern classification; query processing; uncertainty handling; active learning; classification method; feature selection; information density; iterative method; querying algorithm; uncertainty sampling; Biological system modeling; Classification algorithms; Entropy; Mathematical model; Predictive models; Training; Uncertainty;
Conference_Title :
Neural Networks (IJCNN), The 2010 International Joint Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6916-1
DOI :
10.1109/IJCNN.2010.5596776