• DocumentCode
    3643602
  • Title
    Efficient Levenberg-Marquardt minimization of the cross-entropy error function
  • Author
    Amar Sarić; Jing Xiao
  • fYear
    2011
  • fDate
    7/1/2011 12:00:00 AM
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    The Levenberg-Marquardt algorithm is one of the most common choices for training medium-size artificial neural networks. Since it was designed to solve nonlinear least-squares problems, its application to neural network training has so far typically amounted to using simple regression even for classification tasks. In the classification setting, however, the cross-entropy function, which corresponds to the maximum likelihood estimate of the network weights when the sigmoid or softmax activation function is used in the output layer, is the natural choice of error function, and it is convex in the weights of the output layer. This convexity is an important property that leads to more robust convergence of any descent-based training method. By constructing and implementing a modified version of the Levenberg-Marquardt algorithm suitable for minimizing the cross-entropy function, we aim to close a gap in the existing literature on neural networks. Additionally, because the cross-entropy error measure yields a single error value per training pattern, our approach has lower memory requirements for multi-valued classification problems than the direct application of the algorithm.
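
    The following is a minimal sketch, not the authors' exact formulation, of a damped Gauss-Newton / Levenberg-Marquardt-style step for minimizing the cross-entropy of a single sigmoid output unit. All names (X, t, w, lam), the damping schedule, and the tiny synthetic data set are illustrative assumptions made here, not taken from the paper.

```python
# Sketch of one LM-style update for cross-entropy with a sigmoid output
# (assumed formulation; the paper's actual algorithm may differ).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(w, X, t, eps=1e-12):
    y = sigmoid(X @ w)
    return -np.sum(t * np.log(y + eps) + (1 - t) * np.log(1 - y + eps))

def lm_step(w, X, t, lam):
    """One damped step: for a sigmoid output the gradient is X^T (y - t) and
    the Gauss-Newton approximation of the Hessian is X^T diag(y(1-y)) X;
    lam is the Levenberg-Marquardt damping added to the diagonal."""
    y = sigmoid(X @ w)
    grad = X.T @ (y - t)
    R = y * (1.0 - y)                    # per-pattern curvature weights
    H = X.T @ (X * R[:, None])           # Gauss-Newton Hessian approximation
    delta = np.linalg.solve(H + lam * np.eye(len(w)), -grad)
    return w + delta

# Illustrative run with synthetic data: accept the step and shrink lam if the
# error decreases, otherwise reject it and grow lam (the usual LM schedule).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
t = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)
w, lam = np.zeros(3), 1.0
for _ in range(20):
    w_new = lm_step(w, X, t, lam)
    if cross_entropy(w_new, X, t) < cross_entropy(w, X, t):
        w, lam = w_new, lam * 0.5
    else:
        lam *= 2.0
print("final cross-entropy:", cross_entropy(w, X, t))
```

    Note that, as the abstract points out, cross-entropy yields a single error value per training pattern, so the curvature term above is built from scalar per-pattern weights rather than a full residual vector per pattern.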
  • Keywords
    "Training","Minimization","Jacobian matrices","Mathematical model","Equations","Vectors","Classification algorithms"
  • Publisher
    ieee
  • Conference_Titel
    The 2011 International Joint Conference on Neural Networks (IJCNN)
  • ISSN
    2161-4393
  • Print_ISBN
    978-1-4244-9635-8
  • Electronic_ISSN
    2161-4407
  • Type
    conf
  • DOI
    10.1109/IJCNN.2011.6033191
  • Filename
    6033191