DocumentCode :
928817
Title :
Polishing blemishes: issues in data correction
Author :
Teng, Choh Man
Author_Institution :
Inst. for Human & Machine Cognition, Pensacola, FL, USA
Volume :
19
Issue :
2
fYear :
2004
Firstpage :
34
Lastpage :
39
Abstract :
Data quality is crucial to any data analysis task. Many imperfection-handling techniques avoid overfitting or simply remove offending portions of the data. Polishing instead identifies blemishes in the data and corrects them, retaining and recovering as much information as possible. When using information collected from channels susceptible to disturbances, data quality is a concern, especially when the primary objective is to assimilate and understand the data. Imperfections can arise from many sources, including transmission and bandwidth constraints, faults in sensor devices, irregularities in sampling, and transcription errors. An intuitive application that exemplifies handling data imperfections is the spell-checker; developing one requires techniques for repairing data imperfections. We are exploring such techniques using a data correction method called polishing. Here, we compare polishing to two alternative approaches to handling data imperfections, focusing on how to evaluate and validate data correction mechanisms.
Keywords :
data analysis; data handling; data integrity; data mining; bandwidth constraint; data analysis task; data correction method; data imperfection-handling technique; data quality; spell-checker; transcription error; Bandwidth; Cognition; Data analysis; Data mining; Filtering; Humans; Intelligent sensors; Machine learning algorithms; Noise robustness; Sampling methods;
fLanguage :
English
Journal_Title :
IEEE Intelligent Systems
Publisher :
IEEE
ISSN :
1541-1672
Type :
jour
DOI :
10.1109/MIS.2004.1274909
Filename :
1274909