Title :
Confidence Interval for F1 Measure of Algorithm Performance Based on B
Author :
Yu Wang ; Jihong Li ; Yanfang Li ; Ruibo Wang ; Xingli Yang
Author_Institution :
Comput. Center of Shanxi Univ., Taiyuan, China
Abstract :
In studies on applications of machine learning such as Information Retrieval (IR), the focus is typically on estimating the F1 measure of algorithm performance. Approximate symmetrical confidence intervals, constructed from the F1 value based on a cross-validated t distribution, are commonly used in the literature. However, theoretical analysis of the distribution of F1 values shows that this distribution is actually non-symmetrical. Thus, simply using a symmetrical distribution to approximate a non-symmetrical one may be inappropriate and may result in a confidence interval with a low degree of confidence and a long interval length. In the present study, a non-symmetrical confidence interval for the F1 measure based on the Beta prime distribution is constructed by using the F1 value computed from the average confusion matrix of a blocked 3 x 2 cross-validation. Experimental results show that in most cases, our method has high degrees of confidence. With an acceptable degree of confidence, our method has a shorter interval length than the approximate symmetrical confidence intervals based on the blocked 3 x 2 and 5 x 2 cross-validated t distributions. The approximate symmetrical confidence interval based on the 10-fold cross-validated t distribution has the shortest interval length of the four confidence intervals, but it has low degrees of confidence in all cases. Taking these two factors into consideration, our method is recommended.
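The idea of a non-symmetrical F1 interval built from an average confusion matrix can be illustrated with a minimal sketch. It is not the authors' construction. It assumes that the ratio 2TP/(FP+FN), which equals F1/(1-F1), is twice a Beta prime variate with parameters TP + prior and FP + FN + 2*prior, and it estimates the quantiles by Monte Carlo sampling. The function names, the `prior` value, and the fold counts are all hypothetical.

```python
import random


def average_confusion(matrices):
    """Element-wise mean of (tp, fp, fn) triples, one triple per validation run."""
    n = len(matrices)
    return tuple(sum(m[i] for m in matrices) / n for i in range(3))


def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def f1_interval(tp, fp, fn, level=0.95, prior=1.0, draws=20000, seed=0):
    """Equal-tailed Monte Carlo interval for F1.

    Assumption (not taken from the paper): F1 / (1 - F1) is distributed as
    2 * BetaPrime(tp + prior, fp + fn + 2 * prior).
    """
    rng = random.Random(seed)
    a = tp + prior
    b = fp + fn + 2 * prior
    samples = []
    for _ in range(draws):
        # A ratio of independent Gamma(a, 1) and Gamma(b, 1) draws is Beta prime(a, b).
        ratio = rng.gammavariate(a, 1.0) / rng.gammavariate(b, 1.0)
        samples.append(2 * ratio / (1 + 2 * ratio))  # map back to the F1 scale
    samples.sort()
    tail = (1 - level) / 2
    return samples[int(tail * draws)], samples[int((1 - tail) * draws) - 1]


# Hypothetical (tp, fp, fn) counts from the six runs of a 3 x 2 cross-validation.
folds = [(41, 9, 10), (39, 11, 10), (40, 10, 9),
         (40, 10, 11), (42, 10, 10), (38, 10, 10)]
tp, fp, fn = average_confusion(folds)
point = f1_from_counts(tp, fp, fn)
lower, upper = f1_interval(tp, fp, fn)
print(point, lower, upper)
```

Because F1 is bounded by 1, the two endpoints generally do not sit at equal distances from the point estimate, which a symmetrical interval cannot reflect.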
Keywords :
learning (artificial intelligence); matrix algebra; Beta prime distribution; F1 measure; average confusion matrix; blocked 3x2 cross-validation; confidence interval; machine learning; symmetrical distribution; Accuracy; Approximation algorithms; Estimation; Loss measurement; Machine learning algorithms; Standards; Training
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2014.2359667