DocumentCode :
2774426
Title :
An Efficient Decision Tree Construction for Large Datasets
Author :
Uyen Nguyen Thi Van ; Chung, Tae Choong
Author_Institution :
KyungHee Univ., Seoul
fYear :
2007
fDate :
18-20 Nov. 2007
Firstpage :
21
Lastpage :
25
Abstract :
In this paper, we propose a new data structure and a new framework for building decision tree classifiers that are especially suitable for large datasets. The most prominent feature of our algorithm is that building a decision tree requires only one scan over the entire database. Compared with previous methods, which scan the whole database once at each level of the tree, our algorithm is much more efficient. Moreover, our algorithm sorts numeric attributes only once, which significantly reduces the sorting cost and hence the overall execution time. The experimental results show that our algorithm outperforms the RainForest algorithm, a well-known and efficient algorithm for decision tree construction, in execution time. This shows that our algorithm can be applied to large datasets efficiently.
Keywords :
decision trees; pattern classification; tree data structures; data structure; decision tree classifiers; decision tree construction; Buildings; Classification tree analysis; Costs; Data structures; Decision trees; Partitioning algorithms; Sorting; Spatial databases; Training data; Tree data structures;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Innovations in Information Technology, 2007. IIT '07. 4th International Conference on
Conference_Location :
Dubai
Print_ISBN :
978-1-4244-1840-4
Electronic_ISBN :
978-1-4244-1841-1
Type :
conf
DOI :
10.1109/IIT.2007.4430464
Filename :
4430464