DocumentCode :
3037032
Title :
Data mining using genetic programming: the implications of parsimony on generalization error
Author :
Cavaretta, Michael J. ; Chellapilla, Kumar
Author_Institution :
Comput. Aided Eng. Dept., Ford Motor Co., Dearborn, MI, USA
Volume :
2
fYear :
1999
fDate :
1999
Abstract :
A common data mining heuristic is, “when choosing between models with the same training error, less complex models should be preferred as they perform better on unseen data”. This heuristic may not always hold. In genetic programming, a preference for less complex models is implemented by (i) placing a limit on the size of the evolved program, (ii) penalizing more complex individuals, or both. The paper presents a GP-variant with no limit on the complexity of the evolved program that generates highly accurate models on a common dataset.
Keywords :
computational complexity; data mining; generalisation (artificial intelligence); genetic algorithms; GP-variant; common dataset; data mining heuristic; generalization error; genetic programming; less complex models; program complexity; training error; unseen data; Computer aided engineering; Computer errors; Data mining; Decision trees; Genetics; Laboratories; Pattern recognition; Predictive models; Testing; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on
Conference_Location :
Washington, DC
Print_ISBN :
0-7803-5536-9
Type :
conf
DOI :
10.1109/CEC.1999.782602
Filename :
782602