Title :
Improving the random forest algorithm by randomly varying the size of the bootstrap samples
Author_Institution :
Sch. of Comput. & Math., Charles Sturt Univ., Bathurst, NSW, Australia
Abstract :
The Random Forest algorithm generates diverse decision trees as base classifiers by applying the Random Subspace algorithm to bootstrap samples, which works well for high-dimensional datasets. For low-dimensional datasets, however, the diversity among the trees falls sharply. To increase ensemble accuracy by inducing more diversity among the decision trees, we take a different approach. In Random Forest, the bootstrap sample used to generate each base decision tree is always of the same size. We propose to vary the size of the bootstrap samples randomly within a predefined range in order to increase forest accuracy. We conduct extensive experiments on several datasets from the UCI Machine Learning Repository, and the experimental results demonstrate the merit of the proposed technique.
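As an illustration of the idea described in the abstract, the following is a minimal sketch of a forest in which each tree is trained on a bootstrap sample whose size is drawn uniformly at random from a predefined range [low_frac*n, high_frac*n] rather than always being n (the training-set size). The range bounds low_frac and high_frac, the use of scikit-learn's DecisionTreeClassifier with sqrt feature subspacing, and the assumption of non-negative integer class labels are illustrative choices, not details specified by the paper.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def variable_bootstrap_forest(X, y, n_trees=100, low_frac=0.5, high_frac=1.0, seed=0):
    # Train each tree on a bootstrap sample whose size is drawn uniformly
    # from [low_frac * n, high_frac * n] instead of the fixed size n.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    trees = []
    for _ in range(n_trees):
        m = int(rng.integers(int(low_frac * n), int(high_frac * n) + 1))
        idx = rng.integers(0, n, size=m)  # sampling with replacement
        tree = DecisionTreeClassifier(max_features="sqrt",
                                      random_state=int(rng.integers(2**31 - 1)))
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def forest_predict(trees, X):
    # Majority vote over the trees (assumes non-negative integer class labels).
    votes = np.stack([t.predict(X) for t in trees]).astype(int)
    return np.array([np.bincount(col).argmax() for col in votes.T])

Varying the sample size m changes how many distinct training instances each tree sees (a bootstrap of size m covers about 1 - e^(-m/n) of the training set on average), which is one way the randomized size can introduce additional diversity among the trees.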
Keywords :
decision trees; learning (artificial intelligence); pattern classification; UCI machine learning repository; base classifiers; bootstrap files; bootstrap samples; high dimensional datasets; low dimensional datasets; random forest algorithm; random subspace algorithm; Accuracy; Classification algorithms; Decision trees; Ionosphere; Prediction algorithms; Training; Vegetation; decision forest; decision tree; prediction accuracy; random forest
Conference_Title :
Information Reuse and Integration (IRI), 2014 IEEE 15th International Conference on
DOI :
10.1109/IRI.2014.7051904