Title :
Combination of binarization and character segmentation using color information
Author :
Thillou, Céline ; Gosselin, Bernard
Author_Institution :
Faculté Polytechnique de Mons, Belgium
Abstract :
Character segmentation and recognition have been studied for several decades, especially for typewritten characters acquired with a scanner. Commercial OCR software performs well on "clean" documents or requires the user to select the kind of document. Recently, a new kind of image has appeared: pictures taken by a camera in a "real-world" environment. Such images exhibit strong degradations that are absent from scanner-based pictures, as well as complex backgrounds. To extract text as accurately as possible, a new method using color information is proposed. This paper focuses on each chosen parameter and gives comparative results against different recent techniques that use color information. Moreover, emphasis is placed on stroke analysis and character segmentation, which the binarization method takes into account in order to improve subsequent character segmentation and recognition.
Keywords :
cameras; image colour analysis; image scanners; image segmentation; optical character recognition; OCR software; binarization combination; camera; character recognition; character segmentation; color clustering; color information; scanner-based picture; stroke analysis; Cameras; Character recognition; Clustering algorithms; Data processing; Degradation; Image processing; Image segmentation; Optical character recognition software; Robustness; Testing;
Conference_Title :
Signal Processing and Information Technology, 2004. Proceedings of the Fourth IEEE International Symposium on
Print_ISBN :
0-7803-8689-2
DOI :
10.1109/ISSPIT.2004.1433699