DocumentCode :
2022878
Title :
Segmentation and validation of commercial documents logical structure
Author :
Matrakas, Miguel Diogenes ; Bortolozzi, Flávio
Author_Institution :
Dept. of Comput. Sci., Document Analysis & Recognition Lab., Parana, Brazil
fYear :
2000
fDate :
2000
Firstpage :
242
Lastpage :
246
Abstract :
This work presents an approach to extracting and validating the logical structure of the images that make up a commercial document. The Run Length Smoothing Algorithm (RLSA) is used to segment the image of a commercial document of type letter, official letter, or memo. The most common classes considered are date, logotype, text body, signature, addressee, invocation, and greeting. The segmented elements are labeled with the nearest neighbor rule, using a feature vector of 28 characteristics. A database of 283 commercial document images in 256 gray levels was created and used for document element classification. The study reports good results for the classification of elements in commercial documents.
Keywords :
document image processing; hypermedia markup languages; image segmentation; text analysis; Run Length Smoothing Algorithm; commercial document logical structure; commercial documents; common classes; document element classification; document segmentation; document validation; gray levels; image segmentation; logotype; memo; nearest neighbor rule algorithm; official letter; text body; type letter; Costs; Image analysis; Labeling; Laboratories; Nearest neighbor searches; Postal services; Quality management; Smoothing methods; Text analysis; XML;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology: Coding and Computing, 2000. Proceedings. International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
0-7695-0540-6
Type :
conf
DOI :
10.1109/ITCC.2000.844221
Filename :
844221