Title :
A Structural Sampling Technique for Better Decision Trees
Author_Institution :
Div. of Comput. & Inf. Eng., Dongseo Univ., Busan, South Korea
Abstract :
Because data mining problems involve large amounts of data, sampling is necessary for the task to succeed. Decision trees have been developed for prediction, and finding decision trees with smaller error rates has been a major goal in making them successful. This paper proposes a structural sampling technique based on a decision tree generated by a fast and dirty tree generation algorithm. Experiments with several sample sizes and representative decision tree algorithms showed that the method is more effective than conventional random sampling with respect to decision tree size and error rate, especially for small sample sizes.
Keywords :
data mining; decision trees; sampling methods; data sampling; decision tree size; dirty tree generation algorithm; error rate; random sampling method; representative decision tree algorithm; structural sampling technique; Deductive databases; Error analysis; Greedy algorithms; Induction generators; Intelligent structures; Machine learning; Scalability; C4.5; CART; sampling
Conference_Title :
Intelligent Information and Database Systems, 2009. ACIIDS 2009. First Asian Conference on
Conference_Location :
Dong Hoi
Print_ISBN :
978-0-7695-3580-7
DOI :
10.1109/ACIIDS.2009.24