Title :
Segmentation of touching characters in printed Devnagari and Bangla scripts using fuzzy multifactorial analysis
Author :
Garain, Utpal ; Chaudhuri, Bidyut B.
Author_Institution :
Comput. Vision & Pattern Recognition Unit, Indian Stat. Inst., Kolkata, India
Abstract :
Character segmentation errors are one of the main causes of poor recognition rates in optical character recognition (OCR) systems. Touching characters in scanned documents are a major obstacle to designing an effective character segmentation procedure. This paper presents a new technique for identifying and segmenting touching characters, based on fuzzy multifactorial analysis. A predictive algorithm is developed to select candidate cut columns for segmenting the touching characters. The proposed method has been applied to printed documents in Devnagari and Bangla, the two most popular scripts of the Indian subcontinent. Results on a test set of considerable size show that a reasonable improvement in recognition rate can be achieved with a modest increase in computation.
Keywords :
fuzzy set theory; image segmentation; optical character recognition; Bangla; Devnagari; Indian script; character segmentation; fuzzy multifactorial analysis; predictive algorithm; touching characters; Character recognition; Decision making; Fuzzy systems; Image recognition; Optical character recognition software; Pattern recognition; Prediction algorithms; Testing; Text recognition
Journal_Title :
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
DOI :
10.1109/TSMCC.2002.807272