DocumentCode :
3144942
Title :
Document image analysis with OCRopus
Author :
Shafait, Faisal
Author_Institution :
German Res. Center for Artificial Intell. (DFKI GmbH), Kaiserslautern, Germany
fYear :
2009
fDate :
14-15 Dec. 2009
Firstpage :
1
Lastpage :
6
Abstract :
Document image analysis is the field of converting paper documents into an editable electronic representation by performing optical character recognition (OCR). In recent years, there has been a tremendous amount of progress in the development of open source OCR systems. OCRopus is one of the leading open source document analysis systems, with a modular and pluggable architecture. This paper presents an overview of the different steps involved in a document image analysis system and illustrates them with examples from OCRopus.
Keywords :
document image processing; optical character recognition; public domain software; software architecture; OCRopus; document image analysis; electronic representation; modular-pluggable architecture; open source document analysis system; optical character recognition; paper documents; Artificial intelligence; Books; Character recognition; Data structures; Image analysis; Image converters; Open source software; Optical character recognition software; Search engines; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multitopic Conference, 2009. INMIC 2009. IEEE 13th International
Conference_Location :
Islamabad
Print_ISBN :
978-1-4244-4872-2
Electronic_ISBN :
978-1-4244-4873-9
Type :
conf
DOI :
10.1109/INMIC.2009.5383078
Filename :
5383078