Title :
Comparing naive Bayes, decision trees, and SVM with AUC and accuracy
Author :
Huang, Jin ; Lu, Jingjing ; Ling, Charles X.
Author_Institution :
Dept. of Comput. Sci., Univ. of Western Ontario, London, Ont., Canada
Abstract :
Predictive accuracy has often been used as the main, and often the only, evaluation criterion for the predictive performance of classification or data mining algorithms. In recent years, the area under the ROC (receiver operating characteristics) curve, or simply AUC, has been proposed as an alternative single-number measure for evaluating the performance of learning algorithms. We have proved that AUC is, in general, a better measure than accuracy, in a precisely defined sense. Many popular data mining algorithms should therefore be reevaluated in terms of AUC. For example, it is well accepted that naive Bayes and decision trees are very similar in accuracy. How do they compare in AUC? And how does the recently developed SVM (support vector machine) compare to traditional learning algorithms in accuracy and AUC? We answer these questions in this paper. Our conclusions provide important guidelines for data mining applications on real-world datasets.
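To make the distinction between the two measures concrete, the following is a minimal sketch in plain Python. It is not taken from the paper: the labels, the scores, the 0.5 decision threshold, and the names `accuracy`, `auc`, `scores_a`, and `scores_b` are all hypothetical and chosen only for illustration. AUC is computed here through its standard pairwise interpretation, as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Counts the (positive, negative) pairs where the positive example
    scores higher; a tie contributes 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: five negatives followed by five positives.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Two made-up classifiers. Each misclassifies one negative and one positive
# at threshold 0.5, but classifier B places its errors at the extremes of
# the ranking, while classifier A places them next to the threshold.
scores_a = [0.05, 0.15, 0.25, 0.35, 0.55, 0.45, 0.65, 0.75, 0.85, 0.95]
scores_b = [0.15, 0.25, 0.35, 0.45, 0.95, 0.05, 0.55, 0.65, 0.75, 0.85]

acc_a, acc_b = accuracy(labels, scores_a), accuracy(labels, scores_b)
auc_a, auc_b = auc(labels, scores_a), auc(labels, scores_b)

print(f"A: accuracy={acc_a:.2f}, AUC={auc_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, AUC={auc_b:.2f}")
```

On this toy data the two classifiers have the same accuracy but different AUC values, because AUC depends on the whole ranking of scores rather than on a single threshold.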
Keywords :
Bayes methods; data mining; decision trees; learning (artificial intelligence); performance evaluation; sensitivity analysis; support vector machines; Naive Bayes method; ROC; SVM; accuracy prediction; data mining algorithm; receiver operating characteristics; support vector machine; Accuracy; Area measurement; Computer science; Guidelines; Machine learning; Machine learning algorithms; Support vector machine classification
Conference_Title :
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN :
0-7695-1978-4
DOI :
10.1109/ICDM.2003.1250975