DocumentCode
10862
Title
A Note on Generalization Loss When Evolving Adaptive Pattern Recognition Systems
Author
Igel, Christian
Author_Institution
Dept. of Comput. Sci., Univ. of Copenhagen, Copenhagen, Denmark
Volume
17
Issue
3
fYear
2013
fDate
Jun-13
Firstpage
345
Lastpage
352
Abstract
Evolutionary computing provides powerful methods for designing pattern recognition systems. This design process is typically based on finite sample data and therefore bears the risk of overfitting. This paper aims at raising the awareness of various types of overfitting and at providing guidelines for how to deal with them. We restrict our considerations to the predominant scenario in which fitness computations are based on point estimates. Three different sources of losing generalization performance when evolving learning machines, namely overfitting to training, test, and final selection data, are identified, discussed, and experimentally demonstrated. The importance of a pristine hold-out data set for the selection of the final result from the evolved candidates is highlighted. It is shown that it may be beneficial to restrict this last selection process to a subset of the evolved candidates.
Keywords
evolutionary computation; learning (artificial intelligence); pattern recognition; risk analysis; training; adaptive pattern recognition systems; design process; evolutionary computing; final selection data; finite sample data; learning machines; overfitting risk; point estimate-based fitness computations; predominant scenario; pristine hold-out data set; Adaptive systems; Algorithm design and analysis; Machine learning; Pattern recognition; Strain; Training; Training data; Evolutionary learning; machine learning; model selection; overfitting; pattern recognition
fLanguage
English
Journal_Title
Evolutionary Computation, IEEE Transactions on
Publisher
IEEE
ISSN
1089-778X
Type
jour
DOI
10.1109/TEVC.2012.2197214
Filename
6193424
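Illustration
The abstract distinguishes three data sets (training, test, and final selection) and recommends choosing the final model on pristine hold-out data, possibly from only a subset of the evolved candidates. The following is a minimal sketch of that workflow, not the paper's experimental setup: the 1-D threshold classifier, the toy (1+lambda)-style search, the synthetic data, the split fractions, and the rule of shortlisting the `keep` candidates with the lowest test error are all assumptions made for illustration, and every function name is hypothetical.

```python
import random


def make_data(n, rng):
    """Synthetic two-class data (assumption): class y is centered at +1 or -1."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y else -1.0, 1.0)
        data.append((x, y))
    return data


def split_three_ways(data, rng, fractions=(0.5, 0.25, 0.25)):
    """Shuffle, then split into training, test, and final-selection parts."""
    items = list(data)
    rng.shuffle(items)
    n_train = int(fractions[0] * len(items))
    n_test = int(fractions[1] * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_test],
            items[n_train + n_test:])


def error_rate(threshold, samples):
    """Point estimate of the error of the rule 'predict 1 if x > threshold'."""
    wrong = sum((x > threshold) != bool(y) for x, y in samples)
    return wrong / len(samples)


def evolve(train, test, rng, generations=30, offspring=10, step=0.5):
    """Toy (1+lambda)-style search; records one candidate per generation."""
    parent = rng.uniform(-3.0, 3.0)
    candidates = []
    for _ in range(generations):
        pool = [parent] + [parent + rng.gauss(0.0, step) for _ in range(offspring)]
        # Fitness is a point estimate computed on the training data only.
        parent = min(pool, key=lambda t: error_rate(t, train))
        # The test error is stored alongside each candidate for shortlisting.
        candidates.append((error_rate(parent, test), parent))
    return candidates


def select_final(candidates, holdout, keep=5):
    """Pick the final model on hold-out data from the `keep` best by test error."""
    shortlist = sorted(candidates)[:keep]
    return min((t for _, t in shortlist), key=lambda t: error_rate(t, holdout))


rng = random.Random(0)
train, test, holdout = split_three_ways(make_data(400, rng), rng)
candidates = evolve(train, test, rng)
final_threshold = select_final(candidates, holdout)
```

The hold-out data in this sketch is touched only once, in `select_final`. Because the final choice is still made on a finite sample, the hold-out error of the selected model is itself an optimistic estimate, so reporting unbiased performance would need further unseen data.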