• DocumentCode
    2147706
  • Title
    Case Study in Hebrew Character Searching
  • Author
    Rabaev, Irina ; Biller, Ofer ; El-Sana, Jihad ; Kedem, Klara ; Dinstein, Itshak
  • Author_Institution
    Dept. of Comput. Sci., Ben-Gurion Univ., Beer-Sheva, Israel
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    1080
  • Lastpage
    1084
  • Abstract
    Searching for a letter or a word in historical documents is a practical challenge due to the various degradations present in such documents and the wide variance of handwriting. Searching in historical Hebrew documents is somewhat harder because of high similarities among Hebrew characters. In order to determine the features and their combinations appropriate for recognizing Hebrew script, we study a range of known features using a Dynamic Time Warping algorithm. In addition, we describe a novel method for feature-based searching, which uses a number of models for the same character. This method is based on our original DTW algorithm that can match fragments of several models of the same character to match a query character. Consequently, we are not limited to any particular model of the character set. Application of this method leads to a significant improvement, even when using a small set of models.
  • Keywords
    character recognition; document image processing; handwriting recognition; DTW algorithm; Hebrew character searching; dynamic time warping algorithm; feature-based searching; handwriting; historical Hebrew document; Educational institutions; Euclidean distance; Feature extraction; Heuristic algorithms; Hidden Markov models; Text analysis; Vectors; Hebrew historical documents; character searching; dynamic time warping; variational method; word spotting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2011 International Conference on
  • Conference_Location
    Beijing
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4577-1350-7
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2011.218
  • Filename
    6065476
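
The abstract refers to Dynamic Time Warping (DTW) over per-character feature sequences and to searching with several models of the same character. As a rough illustration only, the sketch below shows textbook DTW between two sequences of feature vectors, plus a simple "best of several models" score. It is not the authors' algorithm: their method matches fragments of several models against one query, whereas this sketch only takes the minimum whole-sequence distance. The function names, the Euclidean local cost, and the tuple-based feature representation are all assumptions made for the example.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dtw_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Textbook DTW cost between two non-empty sequences of feature vectors.

    Assumption: the local cost is the Euclidean distance between vectors.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def best_model_distance(query: Sequence[Vector],
                        models: Sequence[Sequence[Vector]]) -> float:
    """Hypothetical helper: smallest DTW cost over several whole models.

    This is a simplification; it does not combine fragments of models.
    """
    return min(dtw_distance(query, model) for model in models)
```

With one feature vector per image column, a query character could be scored against a handful of stored models by calling `best_model_distance(query, models)` and thresholding the result; the threshold and the choice of features are left open here.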