Title :
Rough set and CART approaches to mining incomplete data
Author :
Grzymala-Busse, Jerzy W.
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of Kansas, Lawrence, KS, USA
Abstract :
Many data sets are incomplete, i.e., affected by missing attribute values. In this paper, we report the results of experiments on two approaches to handling missing attribute values. The first is based on rough set theory and rule induction; the second is the CART method, which generates decision trees and uses surrogate splits to handle missing attribute values. Our experiments show that the two approaches are comparable in terms of error rate. Thus, for a specific data set, the best method of handling missing attribute values should be selected individually.
Keywords :
data mining; decision trees; rough set theory; CART method; decision tree generation; missing attribute values; rule induction; Approximation methods; Breast cancer; Error analysis; Image segmentation; CART algorithm for decision tree generation; LERS data mining system; MLEM2 algorithm for rule induction; incomplete data sets
Conference_Title :
2010 International Conference of Soft Computing and Pattern Recognition (SoCPaR)
Conference_Location :
Paris
Print_ISBN :
978-1-4244-7897-2
DOI :
10.1109/SOCPAR.2010.5685860