DocumentCode :
2257048
Title :
Discretization of continuous-valued attributes in decision tree generation
Author :
Li, Wen-Liang ; Yu, Rui-Hua ; Wang, Xi-Zhao
Author_Institution :
Key Lab. of Machine Learning & Comput. Intell., Hebei Univ., Baoding, China
Volume :
1
fYear :
2010
fDate :
11-14 July 2010
Firstpage :
194
Lastpage :
198
Abstract :
The decision tree is one of the most popular and widely used classification models in machine learning. The discretization of continuous-valued attributes plays an important role in decision tree generation. In this paper, we improve Fayyad's discretization method, which uses the average class entropy of candidate partitions to select boundaries for discretization. Our method further reduces the number of candidate boundaries. We also propose a generalized splitting criterion for cut point selection and prove that the cut points are always located on boundaries when this criterion is used. Along with the formal proof, we present empirical results showing that the decision trees generated using such criteria are similar on several datasets from the UCI Machine Learning Repository.
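Illustration (not part of the original record): the following Python sketch shows the classical Fayyad-style, entropy-based cut-point selection that the paper builds on, not the authors' improved method or their generalized splitting criterion. Candidate cuts are restricted to boundary points (midpoints between adjacent sorted examples of different classes), and the cut minimizing the weighted average class entropy of the induced binary partition is chosen. The function names (entropy, best_cut) and the toy data are illustrative assumptions.

# Minimal sketch of entropy-based boundary-point discretization
# (classical Fayyad-style criterion; names and data are illustrative).
from collections import Counter
from math import log2

def entropy(labels):
    # Class entropy H = -sum p * log2(p) over class frequencies.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    # Sort examples by attribute value and evaluate only boundary points,
    # i.e. midpoints between adjacent examples with different class labels.
    pairs = sorted(zip(values, labels))
    best, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        # Skip non-boundary positions and tied attribute values.
        if pairs[i - 1][1] == pairs[i][1] or pairs[i - 1][0] == pairs[i][0]:
            continue
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for x, y in pairs if x <= cut]
        right = [y for x, y in pairs if x > cut]
        # Weighted average class entropy of the binary partition.
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best, best_score = cut, score
    return best, best_score

# Usage example: one continuous attribute with a clean class boundary.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["a", "a", "a", "b", "b", "b"]
print(best_cut(values, labels))  # -> (5.0, 0.0): a pure split at the boundary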
Keywords :
data handling; decision trees; entropy; Fayyad discretization; UCI machine learning repository; classification model; continuous-valued attributes; cut point selection; decision tree generation; generalized splitting criterion; Classification tree analysis; Entropy; Impurities; Indexes; Machine learning; Continuous-valued; Decision tree; Discretization; Splitting criterion;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-1-4244-6526-2
Type :
conf
DOI :
10.1109/ICMLC.2010.5581069
Filename :
5581069