Title :
Robust rule-based prediction
Author_Institution :
Dept. of Mathematics & Comput., Univ. of Southern Queensland, Toowoomba, Qld.
Abstract :
This paper studies the problem of robust rule-based classification, i.e., making predictions in the presence of missing values in data. It differs from other research on missing value handling in that it does not treat the missing values themselves but instead builds a rule-based classification model that tolerates them. Based on a commonly used rule-based classification model, we characterize robustness through a hierarchy of k-optimal rule sets, in which decreasing size corresponds to decreasing robustness. We build classifiers based on k-optimal rule sets and show experimentally that they are more robust than some benchmark rule-based classifiers, such as C4.5rules and CBA. We also show that the proposed approach performs better than two well-known missing value handling methods when the missing values occur in test data.
Keywords :
data handling; data mining; pattern classification; k-optimal rule sets; missing value handling method; rule-based classification; Benchmark testing; Costs; Decision trees; Humidity; Performance evaluation; Rain; Robustness; System testing; classification; rule
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2006.129