• DocumentCode
    9671
  • Title
    Word Segmentation Method for Handwritten Documents based on Structured Learning
  • Author
    Jewoong Ryu ; Hyung Il Koo ; Nam Ik Cho
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Seoul Nat. Univ., Seoul, South Korea
  • Volume
    22
  • Issue
    8
  • fYear
    2015
  • fDate
    Aug. 2015
  • Firstpage
    1161
  • Lastpage
    1165
  • Abstract
    Segmentation of handwritten document images into text-lines and words is an essential task for optical character recognition. However, because the features of handwritten documents are irregular and vary from writer to writer, it is considered a challenging problem. To address this, we formulate word segmentation as a binary quadratic assignment problem that considers pairwise correlations between the gaps as well as the likelihoods of individual gaps. Even though many parameters are involved in our formulation, we estimate all of them within the Structured SVM framework, so that the proposed method works well regardless of writing style and written language, without user-defined parameters. Experimental results on the ICDAR 2009/2013 handwriting segmentation databases show that the proposed method achieves state-of-the-art performance on Latin-based and Indian languages.
  • Keywords
    handwritten character recognition; image segmentation; natural language processing; optical character recognition; support vector machines; text analysis; ICDAR 2009/2013 handwriting segmentation databases; Indian languages; Latin-based languages; binary quadratic assignment problem; handwritten document features; handwritten document image segmentation; optical character recognition; pairwise correlations; parameter estimation; structured SVM framework; structured learning; text-lines; user-defined parameters; word segmentation method; word segmentation problem; writing styles; written languages; Correlation; Cost function; Databases; Image segmentation; Signal processing algorithms; Writing; Handwritten documents; structured SVM; word segmentation;
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2015.2389852
  • Filename
    7004865
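
The abstract describes word segmentation as a binary quadratic assignment over the gaps in a text-line, combining a likelihood for each individual gap with pairwise correlations between gaps. The sketch below illustrates that kind of objective only. It is not the authors' implementation: the function names, the score values, and the exhaustive search are all assumptions made for illustration, and the Structured SVM step that would learn the scores is not shown.

```python
from itertools import product


def score(labels, unary, pairwise):
    """Quadratic score of a binary gap labeling (1 = word boundary, 0 = intra-word gap)."""
    n = len(labels)
    # Per-gap term: reward or penalty for marking gap i as a word boundary.
    total = sum(unary[i] * labels[i] for i in range(n))
    # Pairwise term: counted only when both gaps are marked as boundaries.
    total += sum(pairwise[i][j] * labels[i] * labels[j]
                 for i in range(n) for j in range(i + 1, n))
    return total


def segment(unary, pairwise):
    """Try all 2^n labelings and keep the best; practical only for short lines."""
    n = len(unary)
    return max(product((0, 1), repeat=n),
               key=lambda labels: score(labels, unary, pairwise))


# Hypothetical scores for a line with four gaps (made-up numbers, not from the paper).
unary = [2.0, -1.5, 0.5, -2.0]
pairwise = [
    [0.0, 0.0, 1.0, 0.0],  # gaps 0 and 2 tend to be boundaries together
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(segment(unary, pairwise))
```

Exhaustive search is used here only because it is easy to verify by hand; it grows exponentially with the number of gaps, so a real system would need a more efficient solver.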