DocumentCode :
3195416
Title :
Error-based pruning of decision trees grown on very large data sets can work!
Author :
Hall, L.O. ; Collins, R. ; Bowyer, K.W. ; Banfield, R.
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
fYear :
2002
fDate :
4-6 Nov. 2002
Firstpage :
233
Lastpage :
238
Abstract :
It has been asserted that, using traditional pruning methods, growing decision trees with increasingly larger amounts of training data will result in larger tree sizes even when accuracy does not increase. With regard to error-based pruning, the experimental data used to support this assertion appear to have been obtained using the default pruning strength; in particular, the default certainty factor of 25% in the C4.5 decision tree implementation. We show that, in general, an appropriate setting of the certainty factor for error-based pruning will cause decision tree size to plateau when accuracy is not increasing with more training data.
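The effect described in the abstract can be reproduced in miniature with any decision-tree learner that exposes a pruning-strength parameter. The sketch below uses scikit-learn, which implements cost-complexity pruning (ccp_alpha) rather than C4.5's error-based pruning, so ccp_alpha stands in as an analogous knob; the dataset and all parameter values are illustrative assumptions, not the paper's experimental setup. With pruning effectively off, node count keeps growing with training-set size; a stronger setting lets it plateau once accuracy stops improving.

    # Minimal sketch (assumption: scikit-learn's cost-complexity pruning as an
    # analog of C4.5's certainty factor; synthetic data, illustrative values).
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Fixed synthetic problem; a held-out test set measures accuracy.
    X, y = make_classification(n_samples=60_000, n_features=20,
                               n_informative=8, random_state=0)
    X_pool, X_test, y_pool, y_test = train_test_split(
        X, y, test_size=10_000, random_state=0)

    for n in (2_000, 8_000, 32_000, 50_000):   # growing training-set sizes
        for alpha in (0.0, 1e-4):              # no pruning vs. stronger pruning
            clf = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0)
            clf.fit(X_pool[:n], y_pool[:n])
            print(f"n={n:6d}  ccp_alpha={alpha:g}  "
                  f"nodes={clf.tree_.node_count:5d}  "
                  f"acc={clf.score(X_test, y_test):.3f}")

Running this typically shows node count rising steadily with n when alpha is 0, while the pruned trees level off in size as test accuracy saturates, mirroring the plateau the paper reports for an appropriately tuned certainty factor.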
Keywords :
decision trees; C4.5 decision tree implementation; error-based decision tree pruning; very large data sets; Computer errors; Computer science; Decision trees; Machine learning; Performance evaluation; Stability; Testing; Training data; Tree graphs;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2002)
Conference_Location :
Washington, DC, USA
ISSN :
1082-3409
Print_ISBN :
0-7695-1849-4
Type :
conf
DOI :
10.1109/TAI.2002.1180809
Filename :
1180809