Thinning Arabic characters for feature extraction

Author

Cowell, John ; Hussain, Fiaz

Author_Institution

Dept. of Comput. Sci., De Montfort Univ., Leicester, UK

fYear

2001

fDate

2001

Firstpage

181

Lastpage

185

Abstract

A successful approach to the recognition of Latin characters is to extract features from each character, such as the number of strokes, stroke intersections and holes, and to use ad hoc tests to differentiate between characters which have similar features. The first stage in this process is to produce thinned, one-pixel-thick representations of the characters to simplify feature extraction. This approach works well with high-quality printed Latin characters. With poor-quality characters, however, the thinning process itself is not straightforward and can introduce errors which become apparent in the later stages of the recognition process. The recognition of poor-quality Arabic characters is a particular problem, since the characters are calligraphic: printed characters have widely varying stroke thicknesses to simulate the drawing of the character with a calligraphy pen or brush. This paper describes the problems encountered when thinning large, poor-quality Arabic characters prior to the extraction of their features and submission to a syntactic recognition system.
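
The abstract does not say which thinning algorithm the authors used. As an illustration of what reducing a character to a one-pixel-thick representation involves, the following is a minimal sketch of the well-known Zhang-Suen thinning procedure, chosen here only as a common example and not as the paper's method. The function names and the list-of-lists image format (1 for ink, 0 for background, with a background border) are assumptions made for this sketch.

```python
def neighbours(img, r, c):
    """Return the 8 neighbours P2..P9 of (r, c), clockwise starting from north."""
    return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
            img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]


def transitions(n):
    """Count 0->1 transitions in the circular sequence P2, P3, ..., P9, P2."""
    return sum(a == 0 and b == 1 for a, b in zip(n, n[1:] + n[:1]))


def zhang_suen_thin(image):
    """Thin a binary image given as a list of lists of 0/1.

    Assumes the outermost rows and columns are background (0).
    Returns a new image; the input is left unchanged.
    """
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])
    changed = True
    while changed:
        changed = False
        for step in (0, 1):  # the two sub-iterations of each pass
            to_clear = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(img, r, c)
                    # Keep end points, interior pixels and junction pixels.
                    if not (2 <= sum(n) <= 6 and transitions(n) == 1):
                        continue
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        removable = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        removable = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if removable:
                        to_clear.append((r, c))
            # Clear flagged pixels only after the full scan of this sub-iteration.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


# Example: thin a 3-pixel-thick horizontal bar surrounded by background.
bar = [[0] * 7] + [[0] + [1] * 5 + [0] for _ in range(3)] + [[0] * 7]
skeleton = zhang_suen_thin(bar)
```

Strokes of widely varying thickness, as in the calligraphic characters the abstract describes, are the kind of input on which a simple scheme like this may produce spurious branches or distorted skeletons, which is presumably why thinning is treated as a problem in its own right.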

Keywords

character recognition; feature extraction; image thinning; 1 pixel thick representations; Arabic characters; calligraphy; holes; poor quality Arabic characters; stroke intersections; strokes; syntactic recognition system; thinning process; Character recognition; Computer science; Costs; Feature extraction; Licenses; Optical character recognition software; Optical sensors; Skeleton; Testing; Vehicles

fLanguage

English

Publisher

IEEE

Conference_Titel

Information Visualisation, 2001. Proceedings. Fifth International Conference on

Conference_Location

London

Print_ISBN

0-7695-1195-3

Type

conf

DOI

10.1109/IV.2001.942056

Filename

942056