DocumentCode :
872656
Title :
Order of Nonlinearity as a Complexity Measure for Models Generated by Symbolic Regression via Pareto Genetic Programming
Author :
Vladislavleva, Ekaterina J. ; Smits, Guido F. ; Den Hertog, Dick
Author_Institution :
Dept. of Econ. & Oper. Res., Tilburg Univ., Tilburg
Volume :
13
Issue :
2
fYear :
2009
fDate :
4/1/2009
Firstpage :
333
Lastpage :
349
Abstract :
This paper presents a novel approach to generate data-driven regression models that not only give reliable prediction of the observed data but also have smoother response surfaces and extra generalization capabilities with respect to extrapolation. These models are obtained as solutions of a genetic programming (GP) process, where selection is guided by a tradeoff between two competing objectives: numerical accuracy and the order of nonlinearity. The latter is a novel complexity measure that adopts the notion of the minimal degree of the best-fit polynomial, approximating an analytical function with a certain precision. Using nine regression problems, this paper presents and illustrates two different strategies for the use of the order of nonlinearity in symbolic regression via GP. The combination of optimization of the order of nonlinearity together with the numerical accuracy strongly outperforms "conventional" optimization of a size-related expressional complexity and the accuracy with respect to extrapolative capabilities of solutions on all nine test problems. In addition to exploiting the new complexity measure, this paper also introduces a novel heuristic of alternating several optimization objectives in a 2-D optimization framework. Alternating the objectives at each generation in such a way allows us to exploit the effectiveness of 2-D optimization when more than two objectives are of interest (in this paper, these are accuracy, expressional complexity, and the order of nonlinearity). Results of the experiments on all test problems suggest that alternating the order of nonlinearity of GP individuals with their structural complexity produces solutions that are both compact and have smoother response surfaces, and, hence, contributes to better interpretability and understanding.
Keywords :
computational complexity; extrapolation; genetic algorithms; regression analysis; Pareto genetic programming; best-fit polynomial; data-driven regression models; nonlinearity order; symbolic regression; complexity; evolutionary multiobjective optimization; genetic programming (GP); industrial data analysis; model selection
fLanguage :
English
Journal_Title :
Evolutionary Computation, IEEE Transactions on
Publisher :
IEEE
ISSN :
1089-778X
Type :
jour
DOI :
10.1109/TEVC.2008.926486
Filename :
4632147