DocumentCode :
3326196
Title :
Gradient descent fails to separate
Author :
Brady, M. ; Raghavan, R. ; Slawny, J.
Author_Institution :
Lockheed Res. & Dev., Palo Alto, CA, USA
fYear :
1988
fDate :
24-27 July 1988
Firstpage :
649
Abstract :
In the context of neural network procedures, it is proved that gradient descent on a surface defined by a sum of squared errors can fail to separate families of vectors. Each output is assumed to be a differentiable monotone transformation (typically the logistic) of a linear combination of the inputs. Several examples are given of pairs of families of vectors for which a separating linear combination exists, yet the minimum-cost solution does not yield such a combination. The examples include several cases with no local minima, as well as a one-layer system exhibiting local minima with a large basin of attraction. In contrast to the perceptron convergence theorem, which guarantees that the perceptron learning procedure finds a separating set of weights whenever one exists, there is no convergence theorem for gradient descent that would guarantee correct classification. These results disprove the presumption, made in recent years, that barring local minima, gradient descent will find the best set of weights for a given problem.
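The following is a minimal sketch, not the paper's construction, of the training setup the abstract describes: a single output unit whose value is the logistic of a linear combination of the inputs, trained by gradient descent on a sum-of-squared-errors cost, followed by a check of whether the learned weights separate the two families. The data, learning rate, and iteration count are illustrative assumptions; on the families constructed in the paper, the separation check can fail even though a separating weight vector exists.
```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two hypothetical families of vectors in R^2 (not the paper's examples),
# with a bias feature appended to each input.
A = np.array([[1.0, 0.2], [0.9, -0.1], [1.1, 0.0]])     # target output 1
B = np.array([[-1.0, 0.1], [-0.8, -0.2], [-1.2, 0.3]])  # target output 0
X = np.vstack([A, B])
X = np.hstack([X, np.ones((X.shape[0], 1))])            # bias term
t = np.array([1.0] * len(A) + [0.0] * len(B))

# Gradient descent on E(w) = sum_i (logistic(w . x_i) - t_i)^2
w = np.zeros(X.shape[1])
lr = 0.5
for _ in range(5000):
    y = logistic(X @ w)
    grad = 2.0 * X.T @ ((y - t) * y * (1.0 - y))  # dE/dw via the chain rule
    w -= lr * grad

# The paper's point: the minimiser of E need not separate the families,
# so on suitably chosen data this check can come out False even though
# a separating linear combination exists.
separated = np.all((X @ w > 0) == (t == 1))
print("weights:", w, "separates:", separated)
```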
Keywords :
neural nets; optimisation; differentiable monotone transformation; gradient descent; neural network; squared errors; Neural networks; Optimization methods;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
1988 IEEE International Conference on Neural Networks
Conference_Location :
San Diego, CA, USA
Type :
conf
DOI :
10.1109/ICNN.1988.23902
Filename :
23902