DocumentCode :
2771272
Title :
Performance of Document Image OCR Systems for Recognizing Video Texts on Embedded Platform
Author :
Chattopadhyay, Tanushyam ; Sinha, Priyanka ; Biswas, Provat
Author_Institution :
Innovation Labs, Tata Consultancy Services Ltd., Kolkata, India
fYear :
2011
fDate :
7-9 Oct. 2011
Firstpage :
606
Lastpage :
610
Abstract :
Market demand for an embedded realization of video OCR motivated the authors to evaluate how well existing document image OCR techniques perform on video text. The authors ported the open source OCR systems GOCR and Tesseract to an embedded platform. Their performance on that platform shows that the character-level and word-level recognition accuracy is unacceptable for video text. This paper compares these two open source OCR systems on Indian TV videos and proposes some techniques that can be used to improve the recognition accuracy from 62% to 93%. The challenges of porting the code to an embedded platform are also analyzed.
Keywords :
embedded systems; optical character recognition; text analysis; video signal processing; GOCR; Indian TV video; Tesseract; character level recognition; document image OCR system; document image OCR technique; embedded platform; open source OCR system; video OCR; video text recognition; word level recognition; Accuracy; Character recognition; Engines; Optical character recognition software; Streaming media; Text recognition; ABBYY; FineReader; GOCR; OCR; Tesseract; video;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Communication Networks (CICN), 2011 International Conference on
Conference_Location :
Gwalior
Print_ISBN :
978-1-4577-2033-8
Type :
conf
DOI :
10.1109/CICN.2011.131
Filename :
6112941