Title :
Gradient descent with fast gliding over flat regions: a first report
Author :
Kantabutra, Vitit; Zheleva, Elena
Author_Institution :
Comput. Sci. Program, Idaho State Univ., Pocatello, ID, USA
Abstract :
We present a new type of error backpropagation gradient descent algorithm. In the new algorithm, unless the error is already very small, we move along quickly ("glide") in flat regions. This approach is intuitively appealing because flat regions should be "safe" regions where the error does not change much over distance. Using a simple 2-layer, 3-neuron neural network that computes the XOR function as a test bed, we find that for small to moderate learning rates, this algorithm converges significantly faster than conventional backpropagation gradient descent using the same learning rate outside of flat regions. For example, at eta = 0.5 the new algorithm converges about 3 times as fast as the conventional one. However, the new algorithm is riskier than the conventional one and tends to diverge at higher learning rates. While the new algorithm already has some practical value, it could be even more useful if the divergence problem could be solved. Some ideas that may lead to its solution are given at the end of the paper. This work represents an early attempt to conquer the vast flat regions of an error surface, turning the known properties of flat regions to our advantage.
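The abstract does not specify the exact gliding rule, so the following is only a minimal illustrative sketch of the general idea on the 2-layer, 3-neuron XOR test bed it describes: run ordinary backpropagation, and whenever the gradient is small (a flat region) while the error is not yet small, take a much larger step. The constants GLIDE_FACTOR, FLAT_THRESHOLD, and ERROR_TARGET are assumptions for illustration, not values from the paper.

    # Hypothetical "gliding" gradient descent sketch for the XOR network in the
    # abstract (2 inputs -> 2 sigmoid hidden units -> 1 sigmoid output unit).
    # The gliding rule used here (boost the step when the gradient norm is small
    # but the error is not) is an assumption, not the authors' exact method.
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([[0], [1], [1], [0]], dtype=float)

    W1 = rng.normal(scale=0.5, size=(2, 2))   # input -> hidden weights
    b1 = np.zeros(2)
    W2 = rng.normal(scale=0.5, size=(2, 1))   # hidden -> output weights
    b2 = np.zeros(1)

    ETA = 0.5             # learning rate outside flat regions (abstract's example)
    GLIDE_FACTOR = 10.0   # assumed step boost inside flat regions
    FLAT_THRESHOLD = 0.05 # assumed gradient-norm threshold for "flat"
    ERROR_TARGET = 0.01   # stop gliding/training once the error is this small

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for epoch in range(20000):
        # Forward pass
        H = sigmoid(X @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        E = 0.5 * np.sum((Y - T) ** 2)   # total squared error
        if E < ERROR_TARGET:
            break

        # Backward pass (standard backpropagation, squared error + sigmoids)
        dY = (Y - T) * Y * (1 - Y)
        gW2 = H.T @ dY
        gb2 = dY.sum(axis=0)
        dH = (dY @ W2.T) * H * (1 - H)
        gW1 = X.T @ dH
        gb1 = dH.sum(axis=0)

        # "Glide": in a flat region (small gradient) with non-small error,
        # take a much larger step than plain gradient descent would.
        gnorm = np.sqrt(sum(np.sum(g ** 2) for g in (gW1, gb1, gW2, gb2)))
        step = ETA * GLIDE_FACTOR if gnorm < FLAT_THRESHOLD else ETA

        W1 -= step * gW1; b1 -= step * gb1
        W2 -= step * gW2; b2 -= step * gb2

    print(f"stopped at epoch {epoch}, error {E:.4f}")

As the abstract notes, an overly aggressive boost (or too large a base learning rate) makes such a scheme prone to divergence; the sketch keeps the boost fixed only for simplicity.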
Keywords :
backpropagation; gradient methods; neural nets; 2-layer 3-neuron neural network; XOR function; algorithm convergence; error backpropagation gradient descent algorithm; error curve; fast gliding; flat regions; gradient descent; small to moderate learning rates; Backpropagation algorithms; Computer errors; Computer networks; Computer science; Convergence; Educational institutions; Neural networks; Testing; Turning;
Conference_Titel :
IECON 02: IEEE 2002 28th Annual Conference of the Industrial Electronics Society
Print_ISBN :
0-7803-7474-6
DOI :
10.1109/IECON.2002.1182909