  • DocumentCode
    2850455
  • Title
    Dynamic classifier selection for effective mining from noisy data streams
  • Author
    Zhu, Xingquan ; Wu, Xindong ; Yang, Ying
  • Author_Institution
    Dept. of Comput. Sci., Vermont Univ., Burlington, VT, USA
  • fYear
    2004
  • fDate
    1-4 Nov. 2004
  • Firstpage
    305
  • Lastpage
    312
  • Abstract
    Mining data streams has become an important and challenging task for many real-world applications, such as credit card fraud protection and sensor networking. One popular solution is to separate stream data into chunks, learn a base classifier from each chunk, and then integrate all base classifiers for effective classification. In this paper, we propose a dynamic classifier selection (DCS) mechanism to integrate base classifiers for effective mining from data streams. The proposed algorithm dynamically selects a single "best" classifier to classify each test instance at run time. Our scheme uses statistical information from attribute values: each attribute partitions the evaluation set into disjoint subsets, and a subsequent procedure evaluates the classification accuracy of each base classifier on these subsets. Given a test instance, its attribute values determine the subsets into which similar instances in the evaluation set fall, and the classifier with the highest classification accuracy on those subsets is selected to classify the test instance. Experimental results and comparative studies demonstrate the efficiency and efficacy of our method. Such a DCS scheme appears promising for mining data streams with dramatic concept drift or a significant amount of noise, where the base classifiers are likely to conflict or to have low confidence.
  • Keywords
    data mining; noise; pattern classification; base classifier; dynamic classifier selection; noisy data stream mining; Application software; Computer science; Credit cards; Data mining; Distributed control; Heuristic algorithms; Partitioning algorithms; Protection; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
  • Print_ISBN
    0-7695-2142-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2004.10091
  • Filename
    1410298
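
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes categorical attributes, models base classifiers as plain callables, and sums per-subset accuracies to rank classifiers. The abstract does not specify the aggregation rule, and all function names and the toy data below are hypothetical.

```python
from collections import defaultdict


def build_accuracy_table(classifiers, eval_X, eval_y):
    """Return {(attr_index, attr_value): [accuracy of each classifier]}.

    Each attribute partitions the evaluation set into disjoint subsets,
    one per distinct value; accuracy is measured per subset.
    """
    hits = defaultdict(lambda: [0] * len(classifiers))
    sizes = defaultdict(int)
    for x, y in zip(eval_X, eval_y):
        correct = [int(clf(x) == y) for clf in classifiers]
        for a, v in enumerate(x):
            sizes[(a, v)] += 1
            for k, c in enumerate(correct):
                hits[(a, v)][k] += c
    return {key: [h / sizes[key] for h in hits[key]] for key in sizes}


def select_classifier(table, n_classifiers, x):
    """Pick the classifier with the highest summed accuracy over the
    subsets matched by x's attribute values (assumed aggregation rule)."""
    totals = [0.0] * n_classifiers
    for a, v in enumerate(x):
        accs = table.get((a, v))
        if accs is None:  # value never seen in the evaluation set
            continue
        for k, acc in enumerate(accs):
            totals[k] += acc
    # Ties resolve to the lowest classifier index.
    return max(range(n_classifiers), key=totals.__getitem__)


def classify(classifiers, table, x):
    """Classify x with the single classifier selected for it."""
    return classifiers[select_classifier(table, len(classifiers), x)](x)


# Toy setup: two hypothetical base classifiers over two binary attributes.
classifiers = [lambda x: x[0], lambda x: x[1]]
eval_X = [(0, 0), (0, 1), (1, 0), (1, 1)]
eval_y = [0, 1, 1, 1]
table = build_accuracy_table(classifiers, eval_X, eval_y)

print(select_classifier(table, len(classifiers), (0, 1)))
print(classify(classifiers, table, (1, 0)))
```

In this toy setup the second classifier is more accurate on the subsets matched by the instance `(0, 1)`, while the first is more accurate on those matched by `(1, 0)`, so a different classifier is selected for each instance. Continuous attributes would need to be discretized before such a table could be built.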