DocumentCode :
1755818
Title :
Practical Ensemble Classification Error Bounds for Different Operating Points
Author :
Varshney, Kush R. ; Prenger, Ryan J. ; Marlatt, Tracy L. ; Chen, Brian Y. ; Hanley, William G.
Author_Institution :
Bus. Analytics & Math. Sci. Dept., IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Volume :
25
Issue :
11
fYear :
2013
fDate :
Nov. 2013
Firstpage :
2590
Lastpage :
2601
Abstract :
Classification algorithms used to support the decisions of human analysts are often used in settings in which zero-one loss is not the appropriate indication of performance. The zero-one loss corresponds to the operating point with equal costs for false alarms and missed detections, and no option for the classifier to leave uncertain test samples unlabeled. A generalization bound for ensemble classification at the standard operating point has been developed based on two interpretable properties of the ensemble: strength and correlation, using the Chebyshev inequality. Such generalization bounds for other operating points have not been developed previously and are developed in this paper. Significantly, the bounds are empirically shown to have much practical utility in determining optimal parameters for classification with a reject option, classification for ultralow probability of false alarm, and classification for ultralow probability of missed detection. Counter to the usual guideline of large strength and small correlation in the ensemble, different guidelines are recommended by the derived bounds in the ultralow false alarm and missed detection probability regimes.
Keywords :
Chebyshev approximation; data handling; pattern classification; probability; Chebyshev inequality; different operating points; missed detection; optimal parameters; practical ensemble classification error bounds; standard operating point; ultralow probability; uncertain test samples; zero one loss; Correlation; Guidelines; Humans; Receivers; Standards; Terrorism; Cantelli inequality; random forests; receiver operating characteristic; reject option
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2012.219
Filename :
6378369