DocumentCode :
2219695
Title :
Evolving accurate and comprehensible classification rules
Author :
Sönströd, Cecilia ; Johansson, Ulf ; König, Rikard
Author_Institution :
Sch. of Bus. & Inf., Univ. of Boras, Boras, Sweden
fYear :
2011
fDate :
5-8 June 2011
Firstpage :
1436
Lastpage :
1443
Abstract :
In this paper, Genetic Programming is used to evolve ordered rule sets (also called decision lists) for a number of benchmark classification problems, with evaluation of both predictive performance and comprehensibility. The main purpose is to compare this approach to the standard decision list algorithm JRip and also to evaluate the use of different length penalties and fitness functions for evolving this type of model. The results, using 25 data sets from the UCI repository, show that genetic decision lists with accuracy-based fitness functions outperform JRip regarding accuracy. Indeed, the best setup was significantly better than JRip. JRip, however, held a slight advantage over these models when evaluating AUC. Furthermore, all genetic decision list setups produced models that were more compact than JRip models, and thus more readily comprehensible. The effect of using different fitness functions was very clear; in essence, models performed best on the evaluation criterion that was used in the fitness function, with a worsening of the performance for other criteria. Brier score fitness provided a middle ground, with acceptable performance on both accuracy and AUC. The main conclusion is that genetic programming solves the task of evolving decision lists very well, but that different length penalties and fitness functions have immediate effects on the results. Thus, these parameters can be used to control the trade-off between different aspects of predictive performance and comprehensibility.
Keywords :
data mining; genetic algorithms; knowledge based systems; learning (artificial intelligence); pattern classification; AUC; JRip models; UCI repository; accuracy-based fitness function; benchmark classification problem; comprehensible classification rule; fitness function; genetic decision lists; genetic programming; length penalty; ordered rule set; parameter control; predictive performance; standard decision list algorithm; Accuracy; Data models; Decision trees; Genetics; Partitioning algorithms; Predictive models; Training;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Evolutionary Computation (CEC), 2011 IEEE Congress on
Conference_Location :
New Orleans, LA
ISSN :
Pending
Print_ISBN :
978-1-4244-7834-7
Type :
conf
DOI :
10.1109/CEC.2011.5949784
Filename :
5949784
Link To Document :
بازگشت