DocumentCode
3347843
Title
A gentle Hessian for efficient gradient descent
Author
Collobert, Ronan ; Bengio, Samy
Author_Institution
IDIAP, Martigny, Switzerland
Volume
5
fYear
2004
fDate
17-21 May 2004
Abstract
Several second-order optimization methods for gradient descent algorithms have been proposed over the years, but they usually need to compute the inverse of the Hessian of the cost function (or an approximation of this inverse) during training. In most cases, this leads to an O(n^2) cost in time and space per iteration, where n is the number of parameters, which is prohibitive for large n. We propose instead a study of the Hessian before training. Based on a second-order analysis, we show that a block-diagonal Hessian yields an easier optimization problem than a full Hessian. We also show that the condition of block-diagonality in common machine learning models can be achieved by simply selecting an appropriate training criterion. Finally, we propose a version of the SVM criterion applied to MLPs, which verifies the aspects highlighted in this second-order analysis, but also yields very good generalization performance in practice, taking advantage of the margin effect. Several empirical comparisons on two benchmark datasets are given to illustrate this approach.
Keywords
Hessian matrices; gradient methods; learning (artificial intelligence); multilayer perceptrons; optimisation; support vector machines; MLP; SVM criterion; block-diagonal matrix; cost function; gentle Hessian matrix; gradient descent algorithms; inverse matrix; machine learning; multilayer perceptrons; second-order optimization methods; support vector machines; training criterion; Approximation algorithms; Cost function; Iterative algorithms; Machine learning; Multilayer perceptrons; Optimization methods; Performance analysis; Stochastic processes; Support vector machine classification; Support vector machines;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1327161
Filename
1327161