Title :
A Novel Word Spotting Method Based on Recurrent Neural Networks
Author :
Frinken, Volkmar ; Fischer, Andreas ; Manmatha, R. ; Bunke, Horst
Author_Institution :
Inst. of Comput. Sci. & Appl. Math. (IAM), Univ. of Bern, Bern, Switzerland
Abstract :
Keyword spotting refers to the process of retrieving all instances of a given keyword from a document. In the present paper, a novel keyword spotting method for handwritten documents is described. It is derived from a neural network-based system for unconstrained handwriting recognition. As such, it performs template-free spotting, i.e., it is not necessary for a keyword to appear in the training set. The keyword spotting is done using a modification of the CTC Token Passing algorithm in conjunction with a recurrent neural network. We demonstrate that the proposed systems outperform not only a classical dynamic time warping-based approach but also a modern keyword spotting system based on hidden Markov models. Furthermore, we analyze the performance of the underlying neural networks when using them in a recognition task followed by keyword spotting on the produced transcription. We point out the advantages of keyword spotting when compared to classic text line recognition.
Keywords :
document image processing; handwriting recognition; hidden Markov models; recurrent neural nets; CTC token passing algorithm; handwritten documents; keyword spotting; novel word spotting method; recurrent neural networks; text line recognition; Artificial neural networks; Documentation; Feature extraction; Image segmentation; Indexes; Neural networks; BLSTM; document analysis; historical documents; neural network; offline handwriting
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
DOI :
10.1109/TPAMI.2011.113