• DocumentCode
    671566
  • Title
    Active learning in nonstationary environments
  • Author
    Capo, Robert; Dyer, Karl B.; Polikar, Robi
  • Author_Institution
    Electr. & Comput. Eng. Dept., Rowan Univ., Glassboro, NJ, USA
  • fYear
    2013
  • fDate
    4-9 Aug. 2013
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    An increasing number of practical applications involving streaming nonstationary data has led to a recent surge in algorithms designed to learn from such data. One challenging version of this problem that has not received as much attention, however, is learning from streaming nonstationary data when only a small initial set of data is labeled, with unlabeled data being available thereafter. We have recently introduced the COMPOSE algorithm for learning in such scenarios, which we refer to as initially labeled nonstationary streaming data. COMPOSE works remarkably well; however, it requires limited (gradual) drift, and cannot address special cases such as the introduction of a new class or significant overlap of existing classes, as such scenarios cannot be learned without additional labeled data. Scenarios that provide occasional or periodic limited labeled data are not uncommon, however, and for these many of COMPOSE's restrictions can be lifted. In this contribution, we describe a new version of COMPOSE as a proof-of-concept algorithm that can identify the instances whose labels - if available - would be most beneficial, and then combine those instances with unlabeled data to actively learn from streaming nonstationary data, even when the distribution of the data experiences abrupt changes. On two carefully designed experiments that include abrupt changes as well as the addition of new classes, we show that COMPOSE.AL significantly outperforms the original COMPOSE, while closely tracking the optimal Bayes classifier performance.
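    The abstract's core idea - selecting the instances whose labels, if available, would be most beneficial - is the standard active-learning query step. The sketch below illustrates it with generic least-confidence sampling over a streaming batch; it is not the authors' COMPOSE.AL method, and the nearest-centroid classifier, the `query_budget` parameter, and all data values are illustrative assumptions.

    ```python
    # Hedged sketch: least-confidence query selection on one batch of a stream.
    # NOT the COMPOSE.AL algorithm; classifier and names are illustrative.
    import math

    def nearest_centroid_proba(x, centroids):
        """Softmax over negative distances to class centroids -> pseudo-probabilities."""
        scores = {c: -math.dist(x, mu) for c, mu in centroids.items()}
        m = max(scores.values())
        exps = {c: math.exp(s - m) for c, s in scores.items()}
        z = sum(exps.values())
        return {c: e / z for c, e in exps.items()}

    def select_queries(batch, centroids, query_budget):
        """Pick the query_budget instances whose top-class confidence is lowest."""
        ranked = sorted(
            batch,
            key=lambda x: max(nearest_centroid_proba(x, centroids).values()),
        )
        return ranked[:query_budget]

    # Toy setting: two classes, centroids estimated from the initial labeled data.
    centroids = {"A": (0.0, 0.0), "B": (4.0, 4.0)}
    batch = [(0.2, 0.1), (3.9, 4.2), (2.0, 2.0), (1.9, 2.2)]
    queries = select_queries(batch, centroids, query_budget=2)
    # Points near the decision boundary around (2, 2) are the most ambiguous,
    # so they are the ones whose labels would be requested.
    ```

    In a streaming setting this selection would run per batch, with the queried labels (and the remaining unlabeled data, via semi-supervised learning) used to update the classifier before the next batch arrives.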
  • Keywords
    Bayes methods; data analysis; learning (artificial intelligence); pattern classification; COMPOSE algorithm; active learning; compacted object sample extraction; data distribution; nonstationary data streaming; nonstationary environments; nonstationary streaming data; occasional limited labeled data; optimal Bayes classifier performance; periodic limited labeled data; proof-of-concept algorithm; unlabeled data; Algorithm design and analysis; Classification algorithms; Complexity theory; Data mining; Measurement; Training data; Uncertainty; COMPOSE; active learning; concept drift; non-stationary environment; streaming data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    The 2013 International Joint Conference on Neural Networks (IJCNN)
  • Conference_Location
    Dallas, TX
  • ISSN
    2161-4393
  • Print_ISBN
    978-1-4673-6128-6
  • Type
    conf
  • DOI
    10.1109/IJCNN.2013.6706906
  • Filename
    6706906