Title :
Incremental One-Class Bagging for Streaming and Evolving Big Data
Author :
Bartosz Krawczyk;Michal Wozniak
Author_Institution :
Dept. of Syst. &
Abstract :
Modern machine learning systems need to process big data efficiently. Extracting useful patterns from massive collections of objects requires algorithms that are not only accurate but also fast and of limited computational complexity. However, the difficulty with massive datasets lies not only in their volume. A number of difficulties are embedded in the nature of the data and must be properly addressed in order to design an efficient learning system. In this paper we address multiple problems related to big data analytics. We assume that the data arrive as a stream. Additionally, we work in a non-stationary environment where the nature of the data may change constantly. Finally, we consider a situation where objects from only one class are available, which leads us to the one-class classification task. We propose a novel incremental ensemble of weighted one-class classifiers, based on bagging. Our learners adapt to the evolving nature of the data stream by changing the weights assigned to objects and forgetting outdated examples. The proposed bagging scheme diversifies the pool of individual classifiers, which can run in a distributed computing environment. We propose to maintain the diversity of the ensemble by updating each classifier with a bootstrap sample from the incoming stream. An experimental study demonstrates the usefulness of our approach in scenarios where massive and evolving data streams must be processed without access to counterexamples.
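The scheme outlined in the abstract (per-classifier bootstrap updates from the incoming stream, object weights, and forgetting of outdated examples) can be illustrated with a minimal sketch. This is not the authors' implementation. The base learner here is a toy weighted-centroid model standing in for the weighted one-class classifiers, and every name and parameter (`CentroidOneClass`, `IncrementalOneClassBagging`, `decay`, `min_weight`, `quantile`, the chunk size, majority voting) is an assumption made for illustration only.

```python
import random


class CentroidOneClass:
    """Toy one-class model: weighted centroid plus a distance threshold.

    Assumption: a simple stand-in for a weighted one-class classifier.
    """

    def __init__(self, decay=0.5, min_weight=0.1, quantile=0.95):
        self.decay = decay            # weight multiplier applied at each update
        self.min_weight = min_weight  # examples below this weight are forgotten
        self.quantile = quantile      # fraction of stored examples inside the boundary
        self.memory = []              # list of (point, weight) pairs
        self.center = None
        self.radius = None

    def update(self, points):
        # Age stored examples, then forget those whose weight fell too low.
        aged = [(p, w * self.decay) for p, w in self.memory]
        self.memory = [(p, w) for p, w in aged if w >= self.min_weight]
        # New examples enter with full weight.
        self.memory.extend((p, 1.0) for p in points)
        self._fit()

    def _fit(self):
        total = sum(w for _, w in self.memory)
        dim = len(self.memory[0][0])
        self.center = [
            sum(w * p[i] for p, w in self.memory) / total for i in range(dim)
        ]
        dists = sorted(self._dist(p) for p, _ in self.memory)
        idx = min(len(dists) - 1, int(self.quantile * len(dists)))
        self.radius = dists[idx]

    def _dist(self, p):
        return sum((a - b) ** 2 for a, b in zip(p, self.center)) ** 0.5

    def accepts(self, p):
        return self._dist(p) <= self.radius


class IncrementalOneClassBagging:
    """Ensemble whose members are each updated with their own bootstrap sample."""

    def __init__(self, n_members=5, seed=0, **member_kwargs):
        self.rng = random.Random(seed)
        self.members = [CentroidOneClass(**member_kwargs) for _ in range(n_members)]

    def partial_fit(self, chunk):
        for member in self.members:
            # Bootstrap: draw len(chunk) examples with replacement.
            sample = self.rng.choices(chunk, k=len(chunk))
            member.update(sample)

    def predict(self, p):
        # Majority vote: True means "belongs to the target class".
        votes = sum(m.accepts(p) for m in self.members)
        return votes * 2 > len(self.members)


# Synthetic stream: the target class drifts from around (0, 0) to around (5, 5).
stream_rng = random.Random(1)
ensemble = IncrementalOneClassBagging(n_members=5, seed=0)
for step in range(30):
    shift = 5.0 * step / 29
    chunk = [
        (stream_rng.gauss(shift, 0.5), stream_rng.gauss(shift, 0.5))
        for _ in range(50)
    ]
    ensemble.partial_fit(chunk)
```

In this sketch, each member forgets a chunk after a few updates, so the ensemble tracks the current concept around (5, 5) and no longer accepts points from the outdated concept around (0, 0). Because the members do not share state, each `member.update` call could in principle run on a separate worker, which is in the spirit of the distributed setting the abstract mentions.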
Keywords :
"Big data","Training","Data mining","Support vector machines","Bagging","Learning systems","Machine learning algorithms"
Conference_Title :
Trustcom/BigDataSE/ISPA, 2015 IEEE
DOI :
10.1109/Trustcom.2015.582