Title :
Why error measures are sub-optimal for training neural network pattern classifiers
Author :
Hampshire, John B., II; Kumar, B. V. K. Vijaya
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Pattern classifiers that are trained in a supervised fashion are typically trained with an error-measure objective function such as mean-squared error (MSE) or cross-entropy (CE). These classifiers can in theory yield Bayesian discrimination, but in practice they often fail to do so. The authors explain why this happens and identify a number of characteristics that the optimal objective function for training classifiers must have. They show that classification figures of merit (CFMmono) possess these optimal characteristics, whereas error measures such as MSE and CE do not. The arguments are illustrated with a simple example in which a CFMmono-trained low-order polynomial neural network approximates Bayesian discrimination on a random scalar with the fewest training samples and the minimum functional complexity necessary for the task. A comparable MSE-trained net yields significantly worse discrimination on the same task.
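The contrast the abstract draws can be sketched numerically. The following is a minimal, illustrative numpy example, not the paper's actual experiment: it trains a low-order polynomial "network" on a random scalar drawn from two Gaussians, once with MSE against one-hot targets and once with a CFM_mono-style objective (here assumed to be a sigmoid of the difference between the correct-class output and the competing output, descended as a loss). The class means, polynomial order, learning rate, and sigmoid form are all assumptions chosen for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task (assumed, not from the paper): classify a random scalar drawn
# from one of two Gaussians, class 0 ~ N(-1, 1) and class 1 ~ N(+1, 1).
n = 200
x = np.concatenate([rng.normal(-1.0, 1.0, n), rng.normal(+1.0, 1.0, n)])
labels = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

# Low-order polynomial "network": each class output is a cubic in x.
# Features are standardized so plain gradient descent stays stable.
F = np.stack([x, x**2, x**3], axis=1)
F = (F - F.mean(axis=0)) / F.std(axis=0)
Phi = np.concatenate([np.ones((len(x), 1)), F], axis=1)   # (N, 4)
T = np.eye(2)[labels]                                      # one-hot targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse_grad(W):
    # Gradient of mean-squared error between outputs and one-hot targets.
    return Phi.T @ (Phi @ W - T) * (2.0 / len(x))

def cfm_grad(W):
    # CFM_mono-style objective (illustrative form): maximize
    # sigmoid(y_correct - y_other), i.e. descend on -sigmoid(delta).
    Y = Phi @ W
    sign = np.where(labels == 1, 1.0, -1.0)   # delta = sign * (y1 - y0)
    s = sigmoid(sign * (Y[:, 1] - Y[:, 0]))
    g = -(s * (1.0 - s)) * sign               # dL/d(y1 - y0) per sample
    G = np.column_stack([-g, g])
    return Phi.T @ G / len(x)

def train(grad_fn, steps=2000, lr=0.05):
    W = np.zeros((4, 2))
    for _ in range(steps):
        W -= lr * grad_fn(W)
    return W

accs = {}
for name, grad_fn in [("MSE", mse_grad), ("CFM", cfm_grad)]:
    W = train(grad_fn)
    accs[name] = float(np.mean((Phi @ W).argmax(axis=1) == labels))
    print(f"{name}-trained accuracy: {accs[name]:.2f}")
```

On this easy toy problem both objectives approach the Bayes rate; the paper's point concerns sample efficiency and functional complexity, which this sketch does not attempt to reproduce.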
Keywords :
learning (artificial intelligence); neural nets; pattern recognition; Bayesian discrimination; classification figures of merit; error measure objective function; low-order polynomial neural network; minimum functional complexity; neural network pattern classifiers; optimal objective function; random scalar; statistical pattern recognition; Bayesian methods; Capacity planning; Computer errors; Electric variables measurement; Error analysis; Multilayer perceptrons; Neural networks; Polynomials; Testing; Training data;
Conference_Titel :
International Joint Conference on Neural Networks (IJCNN), 1992
Conference_Location :
Baltimore, MD
Print_ISBN :
0-7803-0559-0
DOI :
10.1109/IJCNN.1992.227338