DocumentCode
245123
Title
Contrary to Popular Belief Incremental Discretization can be Sound, Computationally Efficient and Extremely Useful for Streaming Data
Author
Webb, Geoffrey I.
Author_Institution
Fac. of Inf. Technol., Monash Univ., Melbourne, VIC, Australia
fYear
2014
fDate
14-17 Dec. 2014
Firstpage
1031
Lastpage
1036
Abstract
Discretization of streaming data has received surprisingly little attention. This might be because streaming data require incremental discretization with cut points that may vary over time and this is perceived as undesirable. We argue, to the contrary, that it can be desirable for a discretization to evolve in synchronization with an evolving data stream, even when the learner assumes that attribute values' meanings remain invariant over time. We examine the issues associated with discretization in the context of distribution drift and develop computationally efficient incremental discretization algorithms. We show that discretization can reduce the error of a classical incremental learner and that allowing a discretization to drift in synchronization with distribution drift can further reduce error.
Keywords
data handling; synchronisation; data streaming; distribution drift; incremental discretization; synchronization; Approximation algorithms; Context; Electricity; Histograms; Synchronization; Time-frequency analysis; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2014 IEEE International Conference on
Conference_Location
Shenzhen
ISSN
1550-4786
Print_ISBN
978-1-4799-4303-6
Type
conf
DOI
10.1109/ICDM.2014.123
Filename
7023442