• DocumentCode
    3024925
  • Title
    A Cluster-Based Noise Detection Algorithm
  • Author
    Yin, Hua ; Dong, Hongbin ; Li, Yuxuan
  • fYear
    2009
  • fDate
    25-26 April 2009
  • Firstpage
    386
  • Lastpage
    389
  • Abstract
    For a classification problem, noise in real-world data can dramatically lower the predictive accuracy of a learner and increase the time needed to build a model. Researchers have shown that handling noise in a preprocessing step before learning is advantageous. Previous work has mostly focused on class noise detection because attribute noise is difficult to detect. In this paper, we present a cluster-based noise detection algorithm that considers attribute noise and class noise detection together. It can also handle different types of datasets. Our algorithm detects class noise and attribute noise separately by computing the deviation from the center of the same cluster. We test its effect by adding different types and levels of noise to datasets from the UCI repository. Our approach proves significantly effective at improving the predictive accuracy of classification.
  • Keywords
    data mining; education; noise (working environment); pattern classification; cluster based noise detection algorithm; learner predictive accuracy; modelling; Accuracy; Algorithm design and analysis; Application software; Clustering algorithms; Databases; Detection algorithms; Filters; Noise level; Noise reduction; Predictive models; classification; cluster; noise detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database Technology and Applications, 2009 First International Workshop on
  • Conference_Location
    Wuhan, Hubei
  • Print_ISBN
    978-0-7695-3604-0
  • Type
    conf
  • DOI
    10.1109/DBTA.2009.39
  • Filename
    5207734