• DocumentCode
    3127680
  • Title
    Active Learning from Positive and Unlabeled Data
  • Author
    Ghasemi, Alireza ; Rabiee, Hamid R. ; Fadaee, Mohsen ; Manzuri, Mohammad T. ; Rohban, Mohammad H.
  • fYear
    2011
  • fDate
    11-11 Dec. 2011
  • Firstpage
    244
  • Lastpage
    250
  • Abstract
    During recent years, active learning has evolved into a popular paradigm for utilizing the user's feedback to improve the accuracy of learning algorithms. Active learning works by selecting the most informative sample among unlabeled data and querying the label of that point from the user. Many different methods, such as uncertainty sampling and minimum risk sampling, have been utilized to select the most informative sample in active learning. Although many active learning algorithms have been proposed so far, most of them work with binary or multi-class classification problems and therefore cannot be applied to problems in which only samples from one class as well as a set of unlabeled data are available. Such problems arise in many real-world situations and are known as the problem of learning from positive and unlabeled data. In this paper we propose an active learning algorithm that can work when only samples of one class as well as a set of unlabeled data are available. Our method works by separately estimating the probability density of positive and unlabeled points and then computing the expected value of informativeness to get rid of a hyper-parameter and have a better measure of informativeness. Experiments and empirical analysis show promising results compared to other similar methods.
  • Keywords
    data handling; learning (artificial intelligence); pattern classification; probability; query processing; active learning; label querying; multiclass classification problems; positive data; probability density; unlabeled data; user feedback; Accuracy; Entropy; Estimation; Learning systems; Measurement uncertainty; Training; Uncertainty; active learning; learning from positive and unlabeled data; one-class learning; semi-supervised learning; uncertainty sampling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    978-1-4673-0005-6
  • Type
    conf
  • DOI
    10.1109/ICDMW.2011.20
  • Filename
    6137386