• DocumentCode
    1798345
  • Title

    Domain adaptation bounds for multiple expert systems under concept drift

  • Author

    Ditzler, Gregory ; Rosen, Gail ; Polikar, Robi

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Drexel Univ., Philadelphia, PA, USA
  • fYear
    2014
  • fDate
    6-11 July 2014
  • Firstpage
    595
  • Lastpage
    601
  • Abstract
    The ability to learn incrementally from streaming data, in either an online or a batch setting, is crucial for a prediction algorithm operating in environments that generate vast amounts of data, where it is impractical or simply infeasible to store all historical data. However, learning from streaming data becomes increasingly difficult when the probability distribution generating the data stream evolves over time, which renders the classification model generated from previously seen data suboptimal or potentially useless. Ensemble systems that employ multiple classifiers may be used to mitigate this effect, but even in such cases some classifiers (experts) become less knowledgeable than others for predicting on different domains as the distribution drifts. A further complication arises when labeled data from the prediction (target) domain is not immediately available, causing prediction on the target domain to yield suboptimal results. In this work, we provide upper bounds on the loss, which hold with high probability, of a multiple expert system trained in such a nonstationary environment with verification latency. Furthermore, we show why a single model selection strategy can lead to undesirable results when learning in such nonstationary streaming settings. We support our analytical results with experiments on simulated as well as real-world data sets, comparing several different ensemble approaches to a single model.
  • Keywords
    expert systems; learning (artificial intelligence); pattern classification; probability; classification; data streaming; domain adaptation bounds; ensemble systems; learning algorithms; multiple expert systems; nonstationary environment; nonstationary streaming settings; prediction algorithm; probability distribution; single model selection strategy; verification latency; Expert systems; Labeling; Loss measurement; Prediction algorithms; Probability distribution; Training; Upper bound;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks (IJCNN), 2014 International Joint Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4799-6627-1
  • Type

    conf

  • DOI
    10.1109/IJCNN.2014.6889909
  • Filename
    6889909