DocumentCode :
2208205
Title :
Consequences of Variability in Classifier Performance Estimates
Author :
Raeder, Troy ; Hoens, T. Ryan ; Chawla, Nitesh V.
fYear :
2010
fDate :
13-17 Dec. 2010
Firstpage :
421
Lastpage :
430
Abstract :
The prevailing approach to evaluating classifiers in the machine learning community involves comparing the performance of several algorithms over a series of usually unrelated data sets. Beyond this, however, evaluation methodologies vary wildly along many dimensions. We show that, depending on the stability and similarity of the algorithms being compared, these sometimes arbitrary methodological choices can have a significant impact on the conclusions of any study, including the results of statistical tests. In particular, we show that the performance metrics and data sets used, the type of cross-validation employed, and the number of cross-validation iterations run all have a significant, and often predictable, effect. Based on these results, we offer a series of recommendations for achieving consistent, reproducible results in classifier performance comparisons.
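Illustration (not from the paper): the short Python sketch below shows the kind of variability the abstract refers to, namely that repeating cross-validation with different fold assignments changes the performance estimate for the same classifier on the same data. The use of scikit-learn, a random forest, AUC as the metric, 10 folds, and 10 repetitions are all assumed, illustrative choices.

# Minimal sketch (assumed setup): spread of AUC estimates across
# repetitions of 10-fold stratified cross-validation on one data set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, mildly imbalanced binary classification problem.
X, y = make_classification(n_samples=500, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Repeat cross-validation with different shuffles and record the
# mean AUC of each repetition.
estimates = []
for seed in range(10):
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
    estimates.append(scores.mean())

print("AUC estimates across repetitions:", np.round(estimates, 4))
print("spread (max - min): %.4f" % (max(estimates) - min(estimates)))

The spread printed at the end is nonzero even though the classifier and data are fixed, which is why the paper argues that the cross-validation scheme and the number of repetitions must be reported and chosen carefully when comparing algorithms.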
Keywords :
learning (artificial intelligence); pattern classification; classifier performance estimation; machine learning; variability; classification; evaluation; reproducibility
fLanguage :
English
Publisher :
ieee
Conference_Titel :
2010 IEEE 10th International Conference on Data Mining (ICDM)
Conference_Location :
Sydney, NSW
ISSN :
1550-4786
Print_ISBN :
978-1-4244-9131-5
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2010.110
Filename :
5693996