Title :
Evolving stream classification using change detection
Author :
Mustafa, Albara ; Haque, Ashraful ; Khan, Latifur ; Baron, Michael ; Thuraisingham, Bhavani
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at Dallas, Richardson, TX, USA
Abstract :
Classifying instances in an evolving data stream is a challenging task because of the stream's properties, e.g., infinite length, concept drift, and concept evolution. Most currently available approaches to classifying stream data instances divide the stream into fixed-size chunks so that the data fit in memory, and process the chunks one after another. However, a fixed chunk size may fail to capture concept drift immediately. We determine the chunk size dynamically by applying change point detection (CPD) techniques to the stream data. In general, the distribution families before and after a change point are unknown over the stream, so non-parametric CPD algorithms are used in this case. We propose a multi-dimensional non-parametric CPD technique that determines chunk boundaries over data streams dynamically, which leads to better models for classifying instances of evolving data streams. Experimental results show that our approach can detect the change points and classify instances of evolving data streams with high accuracy compared with other baseline approaches.
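The dynamic-chunking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' multi-dimensional CPD technique, whose details the abstract does not give. As a stand-in, it applies a two-sample Kolmogorov-Smirnov statistic to each feature separately and takes the maximum over features. The names `ks_statistic` and `dynamic_chunks` and the parameters `window` and `threshold` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a_sorted) and j < len(b_sorted):
        x = min(a_sorted[i], b_sorted[j])
        # Advance both samples past every value <= x, then compare the CDFs at x.
        while i < len(a_sorted) and a_sorted[i] <= x:
            i += 1
        while j < len(b_sorted) and b_sorted[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a_sorted) - j / len(b_sorted)))
    return gap


def dynamic_chunks(
    stream: Sequence[Sequence[float]],
    window: int = 50,        # assumed comparison window size (illustrative)
    threshold: float = 0.5,  # assumed detection threshold (illustrative)
) -> List[List[Sequence[float]]]:
    """Split a stream of feature vectors into variable-size chunks.

    Assumption: a change is declared when, for any single feature, the KS
    statistic between the first `window` instances of the current chunk and
    the most recent `window` instances exceeds `threshold`.
    """
    chunks: List[List[Sequence[float]]] = []
    current: List[Sequence[float]] = []
    for point in stream:
        current.append(point)
        if len(current) < 2 * window:
            continue  # not enough data yet to compare two windows
        reference = current[:window]
        recent = current[-window:]
        score = max(
            ks_statistic([p[k] for p in reference], [p[k] for p in recent])
            for k in range(len(point))
        )
        if score > threshold:
            # Close the chunk just before the recent window and restart from it.
            chunks.append(current[:-window])
            current = recent
    if current:
        chunks.append(current)
    return chunks
```

Under these assumptions, a chunk closes whenever the most recent `window` instances differ strongly from the start of the current chunk, so chunk length follows the data rather than a fixed size. The boundary is only approximate: it is placed at the start of the recent window, which can fall some instances before the true change point.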
Keywords :
nonparametric statistics; pattern classification; change point detection techniques; chunk boundary; chunk size; distribution families; evolving data stream classification; multidimensional nonparametric CPD technique; nonparametric algorithms; Data models; Decision trees; Equations; Heuristic algorithms; Histograms; Malware; Training;
Conference_Title :
Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), 2014 International Conference on
Conference_Location :
Miami, FL