• DocumentCode
    2022201
  • Title

    Energy-Based Models in Document Recognition and Computer Vision

  • Author

    LeCun, Yann ; Chopra, Sumit ; Ranzato, Marc'Aurelio ; Huang, Fu-Jie

  • Author_Institution
    New York Univ., New York
  • Volume
    1
  • fYear
    2007
  • fDate
    23-26 Sept. 2007
  • Firstpage
    337
  • Lastpage
    341
  • Abstract
    The machine learning and pattern recognition communities are facing two challenges: solving the normalization problem, and solving the deep learning problem. The normalization problem is related to the difficulty of training probabilistic models over large spaces while keeping them properly normalized. In recent years, the ML and natural language communities have devoted considerable efforts to circumventing this problem by developing "un-normalized" learning models for tasks in which the output is highly structured (e.g. English sentences). This class of models was in fact originally developed during the 1990s in the handwriting recognition community, and includes graph transformer networks, conditional random fields, hidden Markov SVMs, and maximum margin Markov networks. We describe these models within the unifying framework of "energy-based models" (EBM). The deep learning problem is related to the issue of training all the levels of a recognition system (e.g. segmentation, feature extraction, recognition, etc.) in an integrated fashion. We first consider "traditional" methods for deep learning, such as convolutional networks and back-propagation, and show that, although they produce very low error rates for handwriting and object recognition, they require many training samples. We show that using unsupervised learning to initialize the layers of a deep network dramatically reduces the required number of training samples, particularly for such tasks as the recognition of everyday objects at the category level.
  • Keywords
    backpropagation; computer vision; document image processing; handwriting recognition; natural language processing; unsupervised learning; conditional random fields; deep learning problem; document recognition; energy-based models; graph transformer networks; hidden Markov SVM; machine learning; maximum margin Markov networks; natural language; normalization problem; pattern recognition; probabilistic models; error analysis; feature extraction; hidden Markov models; Markov random fields; natural languages; object recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
  • Conference_Location
    Curitiba, Paraná, Brazil
  • ISSN
    1520-5363
  • Print_ISBN
    978-0-7695-2822-9
  • Type

    conf

  • DOI
    10.1109/ICDAR.2007.4378728
  • Filename
    4378728