DocumentCode :
3549146
Title :
On the small sample performance of Boosted classifiers
Author :
Li, Weiliang ; Gao, Xiang ; Zhu, Ying ; Ramesh, Visvanathan ; Boult, Terrance E.
Author_Institution :
Dept. of Comput. Sci. & Eng., Lehigh Univ., Bethlehem, PA, USA
Volume :
2
fYear :
2005
fDate :
20-25 June 2005
Firstpage :
574
Abstract :
Boosting algorithms have been widely applied in machine vision systems. Two fundamental issues that must be resolved in these systems are how much training data and how many Boosting rounds are needed to achieve a desired performance. We view the Boosting algorithm as a nonlinear estimation scheme that estimates a strong classifier from a given training sample set (generated by sampling a true, unknown distribution), the weak classifiers, and the number of Boosting rounds T. The performance characterization of this estimator involves deriving the classification error statistics of the trained strong classifier as a function of the training set and the collection of weak classifiers. Although the convergence and the bounds on the training error and generalization error of these algorithms have been studied for several years, the estimated bounds remain loose and are meaningful only for large training sets. With no effective tools for determining the error bounds, users currently collect as much training data as they can afford, with no good way to know whether it is sufficient. In this paper, we characterize the classification error statistics of the trained strong classifier as a function of the true class distributions, the collection of weak classifiers, and the size of the training set. We show that the statistics can be computed numerically and that the results are more accurate than previous bounds in the literature. Theoretical results are verified through simulations. Face detection is used as a case study to illustrate the application of the theory to real data.
Keywords :
computer vision; convergence; error statistics; estimation theory; face recognition; generalisation (artificial intelligence); image classification; learning (artificial intelligence); Boosting algorithms; generalization error; machine vision system; nonlinear estimation scheme; Boosting; Character generation; Computational modeling; Error analysis; Face detection; Machine vision; Sampling methods; Statistical distributions; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on
ISSN :
1063-6919
Print_ISBN :
0-7695-2372-2
Type :
conf
DOI :
10.1109/CVPR.2005.258
Filename :
1467493