DocumentCode
3433651
Title
Morphological tagging approach in document analysis of invoices
Author
Belaïd, Y. ; Belaïd, A.
Author_Institution
LORIA, Nancy II Univ., France
Volume
1
fYear
2004
fDate
23-26 Aug. 2004
Firstpage
469
Abstract
A morphological tagging approach to the analysis of invoice document images is described. Tokens that are morphologically close and occur at consistent positions across similar contexts reveal parts of speech that represent the structure elements. This bottom-up approach avoids the use of a priori knowledge, provided that the text contains redundant and frequent contexts. The approach is applied to the invoice body text, which is roughly recognized by OCR and automatically segmented. The method makes it possible to detect the invoice articles and their different fields. The regularity of the article composition and its redundancy within the invoice help to recover its structure. The recognition rate, measured on 276 invoices and 1704 articles, is over 91.02% for articles and 92.56% for fields.
Keywords
document image processing; image segmentation; invoicing; optical character recognition; text analysis; OCR; article composition regularity; document image invoice analysis; frequent context; invoice articles detection; invoice body text; morphological tagging approach; parts of speech; redundant context; structure elements; Image analysis; Morphology; Natural languages; Optical character recognition software; Pattern recognition; Redundancy; Speech; Tagging; Text analysis; Text recognition;
fLanguage
English
Publisher
ieee
Conference_Titel
Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on
ISSN
1051-4651
Print_ISBN
0-7695-2128-2
Type
conf
DOI
10.1109/ICPR.2004.1334166
Filename
1334166