DocumentCode :
3704185
Title :
Incremental One-Class Bagging for Streaming and Evolving Big Data
Author :
Bartosz Krawczyk;Michal Wozniak
Author_Institution :
Dept. of Syst. &
Volume :
2
fYear :
2015
Firstpage :
193
Lastpage :
198
Abstract :
Modern machine learning systems need to process big data efficiently. Extracting useful patterns from massive collections of objects requires algorithms that are not only accurate but also fast, with limited computational complexity. However, the problem with massive datasets lies not only in their volume. A number of difficulties are embedded in the nature of the data and must be properly addressed in order to design an efficient learning system. In this paper we address multiple problems related to big data analytics. We assume the streaming nature of our data. Additionally, we work in a non-stationary environment where the nature of the data may constantly change. Finally, we consider a situation where objects from only one class are available, which leads us to the one-class classification task. We propose a novel incremental ensemble of weighted one-class classifiers, based on bagging. Our learners adapt to the evolving nature of the data stream by changing the weights assigned to objects and forgetting outdated examples. The proposed bagging scheme diversifies the pool of individual classifiers, which can run in a distributed computing environment. We propose to maintain the diversity of the ensemble by updating each classifier with a bootstrap sample from the incoming stream. An experimental study demonstrates the usefulness of our approach in scenarios where massive and evolving data streams must be processed without access to counterexamples.
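The update loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record's keywords suggest support vector machines as base learners, whereas the sketch swaps in a toy centroid-based one-class model so that it runs on the Python standard library alone. All names (`CentroidOneClass`, `IncrementalOneClassBagging`, `gaussian_chunk`) and all parameter values (decay factor, radius scale, chunk sizes) are assumptions made for illustration.

```python
import math
import random


class CentroidOneClass:
    """Toy one-class model (assumption): decayed running mean plus a distance threshold."""

    def __init__(self, dim, decay=0.9, radius_scale=3.0):
        self.decay = decay                # forgetting factor applied once per chunk
        self.radius_scale = radius_scale  # threshold = radius_scale * mean distance
        self.weight = 0.0                 # total decayed weight of examples seen
        self.mean = [0.0] * dim
        self.mean_dist = 0.0              # decayed mean distance to the centroid

    def update(self, chunk):
        # Forgetting: shrink the influence of everything learned from older chunks.
        self.weight *= self.decay
        for x in chunk:
            self.weight += 1.0
            rate = 1.0 / self.weight
            self.mean = [m + rate * (xi - m) for m, xi in zip(self.mean, x)]
            dist = math.dist(x, self.mean)
            self.mean_dist += rate * (dist - self.mean_dist)

    def accepts(self, x):
        # True when x is close enough to the centroid to count as the target class.
        return math.dist(x, self.mean) <= self.radius_scale * self.mean_dist


class IncrementalOneClassBagging:
    """Ensemble whose members are each updated with their own bootstrap sample."""

    def __init__(self, dim, n_members=5, seed=0):
        self.rng = random.Random(seed)
        self.members = [CentroidOneClass(dim) for _ in range(n_members)]

    def partial_fit(self, chunk):
        for member in self.members:
            # Sampling with replacement keeps the members diverse.
            bootstrap = self.rng.choices(chunk, k=len(chunk))
            member.update(bootstrap)

    def predict(self, x):
        # Majority vote over the members.
        votes = sum(member.accepts(x) for member in self.members)
        return votes * 2 > len(self.members)


def gaussian_chunk(rng, center, n):
    """Hypothetical data source: n points from a unit Gaussian around center."""
    return [tuple(rng.gauss(c, 1.0) for c in center) for _ in range(n)]


rng = random.Random(42)
model = IncrementalOneClassBagging(dim=2)

# Stationary phase: the target class lives around (0, 0).
for _ in range(20):
    model.partial_fit(gaussian_chunk(rng, (0.0, 0.0), 50))
before_drift = model.predict((0.0, 0.0))

# Concept drift: the target class moves to (5, 5).
for _ in range(60):
    model.partial_fit(gaussian_chunk(rng, (5.0, 5.0), 50))
after_drift_old = model.predict((0.0, 0.0))
after_drift_new = model.predict((5.0, 5.0))
```

In this sketch forgetting is a single multiplicative decay per chunk, which is only one of several ways to discount outdated examples. Because each member needs nothing but its own bootstrap sample, the loop in `partial_fit` could in principle be distributed across workers.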
Keywords :
"Big data","Training","Data mining","Support vector machines","Bagging","Learning systems","Machine learning algorithms"
Publisher :
IEEE
Conference_Title :
Trustcom/BigDataSE/ISPA, 2015 IEEE
Type :
conf
DOI :
10.1109/Trustcom.2015.582
Filename :
7345495