DocumentCode :
183247
Title :
Applications of Recurrent Neural Network Language Model in Offline Handwriting Recognition and Word Spotting
Author :
Nan Li ; Jinying Chen ; Huaigu Cao ; Bing Zhang ; Prem Natarajan
Author_Institution :
Raytheon BBN Technologies, Cambridge, MA, USA
fYear :
2014
fDate :
1-4 Sept. 2014
Firstpage :
134
Lastpage :
139
Abstract :
The recurrent neural network language model (RNNLM) is a discriminative, non-Markovian model that can capture long-span word history in natural language. It has proven successful in automatic speech recognition and machine translation. In this work, we applied RNNLM to the n-best rescoring stage of the state-of-the-art BBN Byblos OCR (optical character recognition) system for handwriting recognition. With RNNLM scores as additional features, our system achieved a significant improvement (p < 0.001), a 3.5% relative reduction in OCR word error rate, compared with a strong baseline that uses an n-gram language model for rescoring. We have also developed a novel method to integrate the OCR n-best RNNLM scores into the word posterior probabilities in OCR confusion networks, which resulted in consistent, observable improvements in word spotting for OCR'ed handwritten documents, as measured by both mean average precision (MAP) and detection-error tradeoff (DET) curves.
Keywords :
document image processing; handwriting recognition; language translation; optical character recognition; recurrent neural nets; word processing; BBN Byblos OCR; DET curves; MAP; OCR confusion networks; OCR handwritten documents; OCR word error rate; RNNLM scores; automatic speech recognition; detection-error tradeoff curves; long-span word history; machine translation; mean average precision; nonMarkovian model; offline handwriting recognition; optical character recognition system; recurrent neural network language model; word posterior probabilities; word spotting; Character recognition; Handwriting recognition; Hidden Markov models; Lattices; Optical character recognition software; Recurrent neural networks; Training; information retrieval; keyword search; optical character recognition; recurrent neural networks
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location :
Heraklion
ISSN :
2167-6445
Print_ISBN :
978-1-4799-4335-7
Type :
conf
DOI :
10.1109/ICFHR.2014.30
Filename :
6981009