Title :
Nearest neighbor ensemble
Author :
Domeniconi, Carlotta ; Yan, Bojun
Author_Institution :
Dept. of Inf. & Software Eng., George Mason Univ., Fairfax, VA, USA
Abstract :
Recent empirical work has shown that combining predictors can lead to a significant reduction in generalization error. The individual predictors (weak learners) can be very simple, such as two-terminal-node trees; it is the aggregating scheme that gives them the power to increase prediction accuracy. Unfortunately, many combining methods do not improve nearest neighbor (NN) classifiers at all. This is because NN methods are very robust with respect to variations of a data set. In contrast, they are sensitive to the choice of input features. We exploit the instability of NN classifiers with respect to different choices of features to generate an effective and diverse set of NN classifiers with possibly uncorrelated errors. Interestingly, the approach takes advantage of the high dimensionality of the data. The experimental results show that our technique offers significant performance improvements with respect to competitive methods.
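The idea summarized in the abstract, building diverse NN classifiers from different choices of features and aggregating their outputs, can be illustrated with a minimal sketch. The abstract does not specify how features are selected or how predictions are combined, so the uniform random feature subsets, the majority vote, and all names below (`nn_predict`, `build_feature_subsets`, `ensemble_predict`) are assumptions made for illustration and should not be read as the authors' method.

```python
import random
from collections import Counter


def nn_predict(train_X, train_y, x, features):
    """1-NN label for x, using squared Euclidean distance on `features` only."""
    best_label, best_dist = None, float("inf")
    for row, label in zip(train_X, train_y):
        dist = sum((row[f] - x[f]) ** 2 for f in features)
        if dist < best_dist:
            best_dist, best_label = dist, label
    return best_label


def build_feature_subsets(n_features, n_members, subset_size, seed=0):
    """Draw one feature subset per ensemble member.

    Assumption: subsets are sampled uniformly at random without replacement;
    the abstract does not say how the features are chosen.
    """
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(n_features), subset_size))
        for _ in range(n_members)
    ]


def ensemble_predict(train_X, train_y, x, subsets):
    """Combine the members' 1-NN predictions by majority vote (an assumption)."""
    votes = Counter(nn_predict(train_X, train_y, x, s) for s in subsets)
    return votes.most_common(1)[0][0]


# Toy data: four training points with four features each, two classes.
train_X = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.2, 0.1, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.9, 1.0, 0.8, 1.0],
]
train_y = ["a", "a", "b", "b"]

subsets = build_feature_subsets(n_features=4, n_members=5, subset_size=2)
print(ensemble_predict(train_X, train_y, [0.05, 0.1, 0.0, 0.1], subsets))
```

Each member sees the same training points but a different projection of the feature space, which is where the diversity comes from in this sketch; resampling the training points instead would change a 1-NN classifier very little, in line with the robustness the abstract describes.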
Keywords :
error statistics; learning (artificial intelligence); pattern classification; trees (mathematics); generalization error reduction; instability; nearest neighbor classifiers; training data set; two terminal node trees; uncorrelated errors; Accuracy; Bagging; Boosting; Computer errors; Nearest neighbor searches; Neural networks; Robustness; Sampling methods; Software engineering; Voting;
Conference_Title :
Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on
Print_ISBN :
0-7695-2128-2
DOI :
10.1109/ICPR.2004.1334065