• DocumentCode
    3237904
  • Title
    The problem of noise in classification: Past, current and future work
  • Author
    Yin, Hua; Dong, Hongbin
  • Author_Institution
    State Key Lab. of Software Eng., Wuhan Univ., Wuhan, China
  • fYear
    2011
  • fDate
    27-29 May 2011
  • Firstpage
    412
  • Lastpage
    416
  • Abstract
    Large volumes of real-world data accumulate awaiting analysis, but imperfections in the data complicate the analysis process. By the principle of “garbage in, garbage out”, a model built on such data will mislead subsequent study. Multiple empirical studies have shown that noise in a dataset dramatically decreases classification accuracy and increases the complexity of classification. The problem of noise in classification has therefore long been a focus of machine learning and data mining. At the same time, because noise is uncertain, the problem remains difficult and open. To study the problem systematically, we summarize and analyze the main research from three aspects: noise models, methods of handling noise, and algorithms for handling noise. Based on past and current work, we discuss some new directions for solving the problem.
  • Keywords
    data analysis; data mining; learning (artificial intelligence); pattern classification; storage management; classification complexity; data accumulation; data classification noise; garbage in garbage out model; machine learning; noise handling method; Analytical models; Atmospheric modeling; Data models; Filtering algorithms; Noise; Noise measurement; Robustness; attribute noise; class noise; handling noise; noise model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communication Software and Networks (ICCSN), 2011 IEEE 3rd International Conference on
  • Conference_Location
    Xi'an
  • Print_ISBN
    978-1-61284-485-5
  • Type
    conf
  • DOI
    10.1109/ICCSN.2011.6014597
  • Filename
    6014597