Title :
Exploiting data preparation to enhance mining and knowledge discovery
Author :
Rajagopalan, Balaji ; Isken, Mark W.
Author_Institution :
Dept. of Decision & Inf. Sci., Oakland Univ., Rochester, MI, USA
fDate :
11/1/2001
Abstract :
One of the major obstacles to using organizational data for mining and knowledge discovery is that, in most cases, the data are not amenable to mining in their natural form. Using a data set from a large tertiary-care hospital, we provide strong empirical evidence that enhancing the data by introducing new attributes, along with judicious aggregation of existing attributes, results in higher-quality knowledge discovery. Interestingly, we also found that data set enhancements have a differential impact on the performance of different data mining algorithms. We define and use several measures, including entropy, rule complexity and resonance, to evaluate the quality and usefulness of the knowledge discovered.
Keywords :
data mining; data preparation; entropy; health care; medical information systems; resonance; algorithm performance; attribute aggregation; attribute introduction; clustering methods; data enhancement; data set enhancements; differential impact; knowledge discovery; knowledge engineering; knowledge quality; knowledge usefulness; rule complexity; tertiary care hospital; data warehouses; hospitals; information technology; Internet; knowledge management; medical services
Journal_Title :
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
DOI :
10.1109/5326.983929