  • DocumentCode
    1749044
  • Title
    The need for small learning rates on large problems
  • Author
    Wilson, D. Randall; Martinez, Tony R.
  • Author_Institution
    fonix Corp., Draper, UT, USA
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    115
  • Abstract
    In gradient descent learning algorithms such as error backpropagation, the learning rate parameter can have a significant effect on generalization accuracy. In particular, decreasing the learning rate below that which yields the fastest convergence can significantly improve generalization accuracy, especially on large, complex problems. The learning rate also directly affects training speed, but not necessarily in the way that many people expect. Many neural network practitioners currently attempt to use the largest learning rate that still allows for convergence, in order to improve training speed. However, a learning rate that is too large can be as slow as a learning rate that is too small, and a learning rate that is too large or too small can require orders of magnitude more training time than one that is in an appropriate range. The paper illustrates how the learning rate affects training speed and generalization accuracy, and thus gives guidelines on how to efficiently select a learning rate that maximizes generalization accuracy.
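    A minimal sketch of the abstract's speed claim, using plain gradient descent on a one-dimensional quadratic loss. This toy setup is an assumption for illustration, not the paper's experiments; the curvature c and the eta values below are chosen only to make the effect visible:

    # Toy problem (assumption, not from the paper): gradient descent on
    # L(w) = 0.5 * c * w**2, whose gradient is c * w. Each update is
    # w <- w * (1 - eta * c), so the error contracts by |1 - eta * c| per step.
    def steps_to_converge(eta, c=4.0, w0=1.0, tol=1e-6, max_steps=100_000):
        """Count gradient-descent steps until |w| < tol; max_steps if too slow."""
        w = w0
        for step in range(max_steps):
            if abs(w) < tol:
                return step
            w -= eta * c * w  # one gradient-descent update
        return max_steps

    # Sweep of illustrative rates: eta = 0.25 (= 1/c) converges fastest, while
    # rates near 0 and near 0.5 (= 2/c) are both orders of magnitude slower.
    for eta in [0.001, 0.01, 0.1, 0.25, 0.4, 0.49]:
        print(f"eta={eta:<6} steps={steps_to_converge(eta)}")

    On this quadratic, rates just below 2/c are as slow as rates near zero, and rates above 2/c diverge outright, which mirrors the abstract's observation that a too-large learning rate can be as slow as a too-small one. The sketch covers only training speed; the paper's point about generalization accuracy requires real data.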
  • Keywords
    generalisation (artificial intelligence); learning (artificial intelligence); neural nets; error backpropagation; fastest convergence; generalization accuracy; gradient descent learning algorithms; large problems; learning rate parameter; small learning rates; training speed; Approximation algorithms; Computer errors; Computer science; Convergence; Guidelines; Neural networks; Nominations and elections
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Title
    International Joint Conference on Neural Networks (IJCNN '01), 2001. Proceedings
  • Conference_Location
    Washington, DC
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-7044-9
  • Type
    conf
  • DOI
    10.1109/IJCNN.2001.939002
  • Filename
    939002