DocumentCode :
3237904
Title :
The problem of noise in classification: Past, current and future work
Author :
Yin, Hua ; Dong, Hongbin
Author_Institution :
State Key Lab. of Software Eng., Wuhan Univ., Wuhan, China
fYear :
2011
fDate :
27-29 May 2011
Firstpage :
412
Lastpage :
416
Abstract :
In the real world, data are accumulated and await analysis, but imperfections in the data complicate the analysis process. By the principle of “garbage in, garbage out”, a model built on such data will mislead subsequent study. Multiple empirical studies have shown that noise in a dataset dramatically decreases classification accuracy and increases the complexity of classification. The problem of noise in classification has therefore long been a focus of machine learning and data mining. At the same time, because noise is uncertain, the problem remains difficult and open. To study the problem systematically, we summarize and analyze the main research from three aspects: noise models, methods of handling noise, and algorithms for handling noise. Based on past and current work, we discuss some new directions for solving the problem.
Keywords :
data analysis; data mining; learning (artificial intelligence); pattern classification; storage management; classification complexity; data accumulation; data analysis; data classification noise; data mining; garbage in garbage out model; machine learning; noise handling method; Analytical models; Atmospheric modeling; Data models; Filtering algorithms; Noise; Noise measurement; Robustness; attribute noise; class noise; handling noise; noise model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communication Software and Networks (ICCSN), 2011 IEEE 3rd International Conference on
Conference_Location :
Xi'an
Print_ISBN :
978-1-61284-485-5
Type :
conf
DOI :
10.1109/ICCSN.2011.6014597
Filename :
6014597