DocumentCode :
2353504
Title :
Bagging is a small-data-set phenomenon
Author :
Chawla, N. ; Moore, T.E., Jr. ; Bowyer, K.W. ; Hall, L.O. ; Springer, C. ; Kegelmeyer, P.
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
Volume :
2
fYear :
2001
fDate :
8-14 Dec. 2001
Abstract :
Bagging forms a committee of classifiers by bootstrap aggregation of training sets from a pool of training data. A simple alternative to bagging is to partition the data into disjoint subsets. Experiments on various datasets show that, given the same size partitions and bags, disjoint partitions result in better performance than bootstrap aggregates (bags). Many applications (e.g., protein structure prediction) involve the use of datasets that are too large to handle in the memory of a typical computer. Our results indicate that, in such applications, the simple approach of creating a committee of classifiers from disjoint partitions is preferred over the more complex approach of bagging.
Keywords :
data mining; learning (artificial intelligence); pattern classification; bagging; bootstrap aggregation; classifier committee; disjoint partitions; protein structure prediction; small dataset; training data pool; training sets; Aggregates; Application software; Bagging; Computer science; Data mining; Laboratories; Proteins; Sampling methods; Testing; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on
Conference_Location :
Kauai, HI, USA
ISSN :
1063-6919
Print_ISBN :
0-7695-1272-0
Type :
conf
DOI :
10.1109/CVPR.2001.991030
Filename :
991030
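Illustrative sketch :
The abstract contrasts two ways of building training sets for a committee of classifiers: bootstrap bags (sampling with replacement) and disjoint partitions (each example used exactly once). The minimal Python sketch below shows only that difference in how the subsets are formed. The function names, the toy index range, and the plurality-vote combiner are assumptions made for illustration; this record does not describe the paper's actual experimental protocol or how the authors combined classifier outputs.

```python
import random
from collections import Counter

def disjoint_partitions(indices, k, rng):
    """Shuffle the indices and split them into k non-overlapping subsets."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    # Striding by k gives k subsets whose sizes differ by at most one.
    return [shuffled[i::k] for i in range(k)]

def bootstrap_bags(indices, k, bag_size, rng):
    """Draw k bags of bag_size indices each, sampling with replacement."""
    pool = list(indices)
    return [[rng.choice(pool) for _ in range(bag_size)] for _ in range(k)]

def majority_vote(predictions):
    """Combine one predicted label per committee member by plurality vote
    (an assumed combiner, not taken from the paper)."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)   # fixed seed so the sketch is repeatable
indices = range(12)      # stand-in for row numbers of a training pool

parts = disjoint_partitions(indices, k=3, rng=rng)
bags = bootstrap_bags(indices, k=3, bag_size=4, rng=rng)

# Each partition and each bag would train one committee member; a bag may
# repeat some examples and omit others, while the partitions together cover
# every example exactly once.
```

With equal subset sizes, as in the comparison the abstract describes, each committee member sees the same number of training examples under either scheme; only the overlap between subsets differs.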