DocumentCode
344192
Title
A new method of character line extraction from mixed-unformatted document image for Japanese mail address recognition
Author
Wang, Xian ; Tsutsumida, Toshio
Author_Institution
Center of Excellence for Document Analysis & Recognition, State Univ. of New York, Buffalo, NY, USA
fYear
1999
fDate
20-22 Sep 1999
Firstpage
769
Lastpage
772
Abstract
Presents a new method of horizontal and vertical character line extraction from mixed (handwritten/printed) unformatted document images, in which the lines vary in character size, gap and orientation and are nested among advertisement characters, drawings and photographs. We use the inherent features of a character line, such as the number and size of the characters it contains and the angular spectrum of the characters. When an area contains characters along both horizontal and vertical lines, competitive judgment is applied. Using multi-set thresholds in a bottom-up methodology, we can successfully extract Japanese mail address character lines. A total of 957 address character lines, taken from 252 pieces of mail, were tested, and a 95.9% correct extraction rate was achieved.
Keywords
document image processing; image segmentation; mailing systems; Japanese mail address recognition; advertisements; bottom-up methodology; character angular spectrum; character line extraction; character line inherent features; character number; character orientation; character size; competitive judgment; drawings; handwritten documents; horizontal character lines; inter-character gap; mixed unformatted document images; multi-set thresholds; photographs; printed documents; vertical character lines; Character recognition; Electronic switching systems; Image analysis; Image recognition; Merging; Postal services; Read only memory; Seals; Testing; Text analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location
Bangalore
Print_ISBN
0-7695-0318-7
Type
conf
DOI
10.1109/ICDAR.1999.791901
Filename
791901