DocumentCode
1967489
Title
Empirical case studies in attribute noise detection
Author
Khoshgoftaar, Taghi M. ; Van Hulse, Jason
Author_Institution
Dept. of Comput. Eng., Florida Atlantic Univ., Boca Raton, FL, USA
fYear
2005
fDate
15-17 Aug. 2005
Firstpage
211
Lastpage
216
Abstract
The problem of determining the noisiest attribute(s) from a set of domain-specific attributes is of practical importance to domain experts and the data mining community. Data noise is generally of two types: attribute noise and mislabeling errors (class noise). For a given domain-specific dataset, attributes that contain a significant amount of noise can have a detrimental impact on the success of a data mining initiative, e.g., by reducing the predictive ability of a classifier in a supervised learning task. Techniques that provide information about the noise quality of an attribute are useful tools for a data mining practitioner when analyzing a dataset or scrutinizing the data collection processes. Our technique for detecting noisy attributes uses an algorithm that we recently proposed for detecting instances with attribute noise. This paper presents case studies that confirm our recent work on detecting noisy attributes and further validate that our technique is able to detect attributes that contain noise.
Keywords
data mining; database management systems; attribute noise; attribute noise detection; class noise; data collection process; domain experts; domain-specific attributes; domain-specific dataset; mislabeling errors; supervised learning task; Computer aided software engineering; Computer errors; Computer science; Data analysis; Data engineering; Data mining; Information analysis; Noise reduction; Performance analysis; Supervised learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Reuse and Integration, Conf, 2005. IRI-2005 IEEE International Conference on.
Print_ISBN
0-7803-9093-8
Type
conf
DOI
10.1109/IRI-05.2005.1506475
Filename
1506475