DocumentCode :
1557307
Title :
A comparative analysis of methods for pruning decision trees
Author :
Esposito, Floriana ; Malerba, Donato ; Semeraro, Giovanni ; Kay, John A.
Author_Institution :
Dipartimento di Inf., Bari Univ.
Volume :
19
Issue :
5
Year :
1997
Date :
May 1, 1997
Firstpage :
476
Lastpage :
491
Abstract :
In this paper, we address the problem of retrospectively pruning decision trees induced from data according to a top-down approach. This problem has received considerable attention in the areas of pattern recognition and machine learning, and many distinct methods have been proposed in the literature. We make a comparative study of six well-known pruning methods with the aim of understanding their theoretical foundations, their computational complexity, and the strengths and weaknesses of their formulation. Comments on the characteristics of each method are empirically supported. In particular, extensive experiments performed on several data sets lead us to conclusions on the predictive accuracy of simplified trees that are opposite to some drawn in the literature. We attribute this divergence to differences in experimental designs. Finally, we prove and make use of a property of the reduced error pruning method to obtain an objective evaluation of the tendency to overprune/underprune observed in each method.
Keywords :
computational complexity; decision theory; learning systems; optimisation; trees (mathematics); decision tree pruning; grafting operators; machine learning; reduced error pruning; top-down induction; Accuracy; Classification tree analysis; Decision trees; Design for experiments; Medical diagnosis; Pattern recognition; Regression tree analysis; Testing
Language :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/34.589207
Filename :
589207