Title :
Text string extraction from images of colour-printed documents
Author :
Suen, H.-M. ; Wang, J.-F.
Author_Institution :
Inst. of Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
fDate :
8/1/1996 12:00:00 AM
Abstract :
Given the mass of printed documents today, an automated entry system is highly desirable. Many techniques for processing monochrome documents have been proposed in past years, but few deal with colour-printed documents. The authors discuss the processing of colour-printed documents in 24-bit true colour images and propose an approach for extracting text strings from them. Due to the very large amount of data in a 24-bit true colour image, processing is usually very time-consuming. To reduce the computational complexity and thus speed up processing, the original colour image is first transformed into a binary image of edge representation for page segmentation. Then a new method is used to identify the text blocks, which are transformed into white-background/black-text binary images for an OCR system. The proposed approach was implemented and tested on a Pentium/90 PC; experimental results have demonstrated its feasibility.
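The two image transformations named in the abstract can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify: the edge step here is a plain neighbour-difference threshold, and the binarisation step assumes that the background covers more pixels than the text. The function names `edge_map` and `to_black_on_white`, the `threshold` parameter, and the midpoint luminance cut are all hypothetical choices made for illustration.

```python
def edge_map(img, threshold=60):
    """Reduce a 24-bit colour image to a binary edge image.

    img is a list of rows of (r, g, b) tuples. A pixel is marked 1 when any
    colour channel differs from its right or lower neighbour by at least
    `threshold` (an assumed, illustrative value).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            p = img[y][x]
            diff = 0
            if x + 1 < w:
                diff = max(diff, max(abs(a - b) for a, b in zip(p, img[y][x + 1])))
            if y + 1 < h:
                diff = max(diff, max(abs(a - b) for a, b in zip(p, img[y + 1][x])))
            out[y][x] = 1 if diff >= threshold else 0
    return out


def to_black_on_white(block):
    """Binarise a colour text block to black text (0) on white background (255).

    Assumption: the background occupies more pixels than the text, so the
    minority luminance class is treated as text.
    """
    # Integer luminance using the common 0.299 / 0.587 / 0.114 weights.
    lum = [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row] for row in block]
    flat = [v for row in lum for v in row]
    cut = (min(flat) + max(flat)) / 2  # midpoint cut, chosen for simplicity
    dark = sum(1 for v in flat if v < cut)
    text_is_dark = dark <= len(flat) - dark
    return [[0 if (v < cut) == text_is_dark else 255 for v in row] for row in lum]
```

In this sketch, yellow text on a red background comes out as black text on white because the yellow pixels form the minority class. A real page would need a more robust edge operator and threshold selection than the ones assumed here.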
Keywords :
computational complexity; document image processing; edge detection; feature extraction; image colour analysis; image representation; image segmentation; microcomputer applications; optical character recognition; 24 bit; OCR system; Pentium/90 PC; automated entry system; binary image; black-text images; colour printed documents; computational complexity reduction; edge representation; experimental results; monochrome documents processing; page segmentation; text blocks; text string extraction; true colour images; white-background images;
Journal_Title :
Vision, Image and Signal Processing, IEE Proceedings -
DOI :
10.1049/ip-vis:19960325