• DocumentCode
    245123
  • Title

    Contrary to Popular Belief Incremental Discretization can be Sound, Computationally Efficient and Extremely Useful for Streaming Data

  • Author

    Webb, Geoffrey I.

  • Author_Institution
    Fac. of Inf. Technol., Monash Univ., Melbourne, VIC, Australia
  • fYear
    2014
  • fDate
    14-17 Dec. 2014
  • Firstpage
    1031
  • Lastpage
    1036
  • Abstract
    Discretization of streaming data has received surprisingly little attention. This might be because streaming data require incremental discretization with cut points that may vary over time and this is perceived as undesirable. We argue, to the contrary, that it can be desirable for a discretization to evolve in synchronization with an evolving data stream, even when the learner assumes that attribute values' meanings remain invariant over time. We examine the issues associated with discretization in the context of distribution drift and develop computationally efficient incremental discretization algorithms. We show that discretization can reduce the error of a classical incremental learner and that allowing a discretization to drift in synchronization with distribution drift can further reduce error.
  • Keywords
    data handling; synchronisation; data streaming; distribution drift; incremental discretization; synchronization; Approximation algorithms; Context; Electricity; Histograms; Synchronization; Time-frequency analysis; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2014 IEEE International Conference on
  • Conference_Location
    Shenzhen
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4799-4303-6
  • Type

    conf

  • DOI
    10.1109/ICDM.2014.123
  • Filename
    7023442