Title :
Performance of Document Image OCR Systems for Recognizing Video Texts on Embedded Platform
Author :
Chattopadhyay, Tanushyam ; Sinha, Priyanka ; Biswas, Provat
Author_Institution :
Innovation Labs, Tata Consultancy Services Ltd., Kolkata, India
Abstract :
Market demand for an embedded realization of video OCR motivated the authors to evaluate how well existing document image OCR techniques perform on that task. The authors ported open source OCR systems such as GOCR and Tesseract to an embedded platform. Their performance on that platform shows that character-level and word-level recognition accuracy is unacceptable for video text. This paper compares two such open source OCR systems on Indian TV videos and proposes techniques that can improve recognition accuracy from 62% to 93%. The challenges of porting this code to an embedded platform are also analyzed.
Keywords :
embedded systems; optical character recognition; text analysis; video signal processing; GOCR; Indian TV video; Tesseract; character level recognition; document image OCR system; document image OCR technique; embedded platform; open source OCR system; video OCR; video text recognition; word level recognition; Accuracy; Character recognition; Engines; Optical character recognition software; Streaming media; Text recognition; ABBYY; FineReader; OCR; video
Conference_Title :
Computational Intelligence and Communication Networks (CICN), 2011 International Conference on
Conference_Location :
Gwalior
Print_ISBN :
978-1-4577-2033-8
DOI :
10.1109/CICN.2011.131