DocumentCode
1994012
Title
Numerical sequence extraction in handwritten incoming mail documents
Author
Koch, G. ; Heutte, L. ; Paquet, T.
Author_Institution
Lab. PSI, Univ. de Rouen, France
fYear
2003
fDate
3-6 Aug. 2003
Firstpage
369
Abstract
In this communication, we propose a method for the automatic extraction of numerical fields from handwritten documents. The approach exploits the known syntactic structure of the numerical field to be extracted, combined with a set of contextual morphological features, to find the best label for each connected component. Applying an HMM-based syntactic analyzer to the whole document makes it possible to localize and extract the fields of interest. Results reported on the extraction of zip codes, phone numbers and customer codes from handwritten incoming mail documents demonstrate the value of the proposed approach.
Keywords
computational linguistics; feature extraction; handwritten character recognition; hidden Markov models; mailing systems; HMM based syntactic analyzer; automatic extraction; contextual morphological features; customer codes; handwritten documents; handwritten incoming mail documents; numerical fields; numerical sequence extraction; phone numbers; syntactic structure; zip codes; Context; Data mining; Dispatching; Face recognition; Filters; Handwriting recognition; Hidden Markov models; Labeling; Postal services; Text processing
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
Print_ISBN
0-7695-1960-1
Type
conf
DOI
10.1109/ICDAR.2003.1227691
Filename
1227691