Document Recognition without Strong Models

Author

Baird, Henry S.

Author_Institution

Comput. Sci. & Eng. Dept., Lehigh Univ., Bethlehem, PA, USA

fYear

2011

fDate

18-21 Sept. 2011

Firstpage

414

Lastpage

423

Abstract

Can a high-performance document image recognition system be built without detailed knowledge of the application? Having benefited from the statistical machine learning revolution of the last twenty years, our architectures rely less on hand-crafted special-case rules and more on models trained on labeled-sample data sets. But urgent questions remain. When we can't collect (and label) enough real training data, does it help to complement them with data synthesized using generative models? Is it ever completely safe to rely on synthetic data? If we can't manage to train (or craft) a single complete, near-perfect, application-specific "strong" model to drive recognition, can we make progress by combining several imperfect or incomplete "weak" models? Can recognition that is carried out jointly over weak models perform optimally while still running fast? Can a recognizer automatically pick a strong model of its input? Must we always pre-train models for every kind ("style") of input expected, or can a recognizer adapt to unknown styles? Can weak models adapt autonomously, growing stronger and so driving accuracy higher, without any human intervention? Can one model "criticize" - and then proceed to correct - other models, even while it is being criticized and corrected in turn by them? After twenty-five years of research on these questions we have partial answers, many in the affirmative: in addition to promising laboratory demonstrations, we can take pride in successful applications. I'll illustrate the evolution of the state of the art with concrete examples, and point out open problems.

Keywords

document image processing; image recognition; learning (artificial intelligence); set theory; data set; drive recognition; hand crafted special case rule; high performance document image recognition system; human intervention; pretrain model; real training data synthesis; statistical machine learning revolution; Accuracy; Adaptation models; Data models; Games; Humans; Image recognition; Semantics; Rule-based recognition; adaptive recognition; anytime algorithms; joint recognition; learning models; model-driven recognition; strong versus weak models; style-conscious recognition; synthetically generated data; whole-book recognition;

fLanguage

English

Publisher

IEEE

Conference_Titel

Document Analysis and Recognition (ICDAR), 2011 International Conference on

Conference_Location

Beijing

ISSN

1520-5363

Print_ISBN

978-1-4577-1350-7

Electronic_ISBN

1520-5363

Type

conf

DOI

10.1109/ICDAR.2011.91

Filename

6065346