Title :
Core support extraction for learning from initially labeled nonstationary environments using COMPOSE
Author :
Capo, Robert ; Sanchez, Abel ; Polikar, Robi
Author_Institution :
Dept. of Electr. & Comput. Eng., Rowan Univ., Glassboro, NJ, USA
Abstract :
Learning in nonstationary environments, also called concept drift, requires an algorithm to track and learn from streaming data drawn from a nonstationary (drifting) distribution. When data arrive continuously, a concept drift algorithm is required to maintain an up-to-date hypothesis that evolves with the changing environment. A more difficult problem that has received less attention, however, is learning from so-called initially labeled nonstationary environments, where the environment provides only unlabeled data after initialization. Since the labels of such data never become available, learning in this setting is also referred to as extreme verification latency: the algorithm must use only unlabeled data to keep the hypothesis current. In this contribution, we analyze COMPOSE, a framework recently proposed for learning in such environments. One of the central processes of COMPOSE is core support extraction, where the algorithm predicts which data instances will be useful and relevant for classification in future time steps. We compare two different options for core support extraction, namely Gaussian mixture model (GMM) based maximum a posteriori sampling and α-shape compaction, and analyze their effects on both the accuracy and the computational complexity of the algorithm. Our findings point to a trade-off, as is the case in most engineering problems: α-shapes are more versatile in most situations, but they are far more computationally complex, especially as the dimensionality of the dataset increases. Our proposed GMM procedure allows COMPOSE to operate on datasets of substantially larger dimensionality without affecting its classification performance.
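The density-based core support extraction described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the paper fits a full Gaussian mixture model per class, while this sketch simplifies to a single Gaussian component and keeps the instances with the highest estimated density (lowest Mahalanobis distance) as core supports. The function name and `keep_fraction` parameter are illustrative assumptions.

```python
import numpy as np

def extract_core_supports(X, keep_fraction=0.5):
    """Retain the densest fraction of instances as core supports.

    Simplification: a single Gaussian stands in for the GMM used in
    the paper; density ranking is done via squared Mahalanobis
    distance (smaller distance = higher Gaussian density).
    """
    mean = X.mean(axis=0)
    # Regularize the covariance so the inverse is well conditioned
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    diff = X - mean
    maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff)
    n_keep = max(1, int(keep_fraction * len(X)))
    return X[np.argsort(maha)[:n_keep]]

# Example: a dense blob plus a few scattered outliers; the retained
# core supports come from the dense region, dropping the outliers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (100, 2)),
               rng.normal(0.0, 5.0, (5, 2))])
core = extract_core_supports(X, keep_fraction=0.5)
print(core.shape)  # half of the 105 instances retained
```

At the next time step, the retained core supports would be combined with newly arriving unlabeled data before the semi-supervised classification step, which is the role core support extraction plays inside COMPOSE.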
Keywords :
Gaussian processes; learning (artificial intelligence); maximum likelihood estimation; pattern classification; COMPOSE framework; Gaussian mixture model; α-shape compaction; classification performance; concept drift algorithm; concept drift learning; core support extraction; drifting distribution; extreme verification latency; initially labeled nonstationary environment; maximum a posteriori sampling; Compaction; Data mining; Gaussian mixture model; Prediction algorithms; Shape; Training data;
Conference_Titel :
Neural Networks (IJCNN), 2014 International Joint Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-6627-1
DOI :
10.1109/IJCNN.2014.6889917