• DocumentCode
    26385
  • Title

    Adaptive Preprocessing for Streaming Data

  • Author

    Zliobaite, Indre ; Gabrys, Bogdan

  • Author_Institution
    Smart Technol. Res. Centre, Bournemouth Univ., Poole, UK
  • Volume
    26
  • Issue
    2
  • fYear
    2014
  • fDate
    Feb. 2014
  • Firstpage
    309
  • Lastpage
    321
  • Abstract
    Many supervised learning approaches that adapt to changes in data distribution over time (e.g., concept drift) have been developed. The majority of them assume that the data comes already preprocessed or that preprocessing is an integral part of a learning algorithm. In real application tasks, data that comes from, e.g., sensor readings is typically noisy and contains missing values and redundant features, and a very large part of the model development effort is devoted to data preprocessing. As data evolves over time, learning models need to be able to adapt to changes automatically. From a practical perspective, automating a predictor makes little sense if preprocessing requires manual adjustment over time. Nevertheless, adaptation of preprocessing has been largely overlooked in research. In this paper, we introduce and address the problem of adaptive preprocessing. We analyze when and under what circumstances it is beneficial to handle adaptivity of preprocessing and adaptivity of the learning model separately. We present three scenarios where handling adaptive preprocessing separately benefits the final prediction accuracy and illustrate them using computational examples. As a result of our analysis, we construct a prototype approach for combining adaptive preprocessing with an adaptive predictor online. Our case study with real sensory data from a production process demonstrates that decoupling the adaptivity of preprocessing and the predictor contributes to improving the prediction accuracy. The developed reference framework and our experimental findings are intended to serve as a starting point for systematic research on adaptive preprocessing mechanisms for adaptive learning with evolving data.
  • Keywords
    data handling; learning (artificial intelligence); adaptive predictor; adaptive preprocessing; data distribution; final prediction accuracy; learning algorithm; learning model adaptivity; real-application tasks; sensor readings; sensory data; streaming data; supervised learning approaches; Adaptation models; Adaptive systems; Data models; Feature extraction; Predictive models; Principal component analysis; Supervised learning; Concept drift; adaptive preprocessing; streaming data
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2012.147
  • Filename
    6247432