Title of article :
An empirical bias–variance analysis of DECORATE ensemble method at different training sample sizes
Author/Authors :
Chun-Xia Zhang, Guan-Wei Wang, Jiang-She Zhang
Issue Information :
Journal with serial issue number, year 2012
Abstract :
DECORATE (Diverse Ensemble Creation by Oppositional Relabeling of Artificial Training Examples) is
a classifier combination technique that constructs a set of diverse base classifiers using additional artificially
generated training instances. The predictions of the base classifiers are then combined into one by the
mean combination rule. To gain more insight into its effectiveness and advantages, this paper
uses a large experiment to carry out a bias–variance analysis of DECORATE and of some other widely
used ensemble methods (bagging, AdaBoost and random forest) at different training sample sizes. The
experimental results lead to the following conclusions. For small training sets, DECORATE has a dominant
advantage over its rivals, and its success is attributed to its achieving a larger bias reduction than the other
algorithms. As the amount of training data increases, AdaBoost benefits most: its bias reduction gradually
becomes significant, while its variance reduction is moderate. Thus, AdaBoost performs best with
large training samples. Moreover, random forest is consistently second best for both small and large
training sets, and it is seen mainly to decrease variance while maintaining low bias. Bagging appears to be
an intermediate method, since it primarily reduces variance.
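The abstract does not state which bias–variance decomposition the authors used. As an illustration only, the following minimal sketch assumes a noise-free, Domingos-style decomposition for 0-1 loss: the "main prediction" at a test point is the label predicted most often across models trained on different training samples, bias is the 0-1 loss of that main prediction, and variance is the average disagreement with it. The function name `bias_variance_01` and the input layout are hypothetical and are not taken from the paper.

```python
from collections import Counter


def bias_variance_01(predictions, y_true):
    """Estimate average bias, variance and error under 0-1 loss.

    predictions[r][i] is the label predicted for test point i by the
    model trained on the r-th training sample (hypothetical layout).
    Assumes noise-free labels, so y_true is treated as the optimal prediction.
    """
    n_runs = len(predictions)
    n_points = len(y_true)
    bias_sum = 0.0
    var_sum = 0.0
    err_sum = 0.0
    for i in range(n_points):
        votes = [predictions[r][i] for r in range(n_runs)]
        # Main prediction: the most frequent label across runs
        # (ties go to the label seen first).
        main = Counter(votes).most_common(1)[0][0]
        # Bias at this point: 1 if the main prediction is wrong, else 0.
        bias_sum += float(main != y_true[i])
        # Variance at this point: fraction of runs that disagree with the main prediction.
        var_sum += sum(v != main for v in votes) / n_runs
        # Expected 0-1 error at this point, averaged over runs.
        err_sum += sum(v != y_true[i] for v in votes) / n_runs
    return bias_sum / n_points, var_sum / n_points, err_sum / n_points


# Toy example: 3 runs, 3 test points, two classes.
preds = [[0, 1, 1],
         [0, 1, 0],
         [0, 0, 0]]
labels = [0, 1, 1]
bias, variance, error = bias_variance_01(preds, labels)
print(bias, variance, error)
```

In a study like the one described, each row of `predictions` would come from an ensemble (DECORATE, bagging, AdaBoost or random forest) retrained on a fresh training sample of a given size, and the estimates would be compared across sample sizes.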
Keywords :
classifier combination method, AdaBoost, bias–variance decomposition, training sample size, random forest
Journal title :
JOURNAL OF APPLIED STATISTICS