• DocumentCode
    1811684
  • Title
    Amharic character recognition using a fast signature based algorithm
  • Author
    Cowell, John; Hussain, Fiaz
  • Author_Institution
    Dept. of Comput. Sci., De Montfort Univ., Leicester, UK
  • fYear
    2003
  • fDate
    16-18 July 2003
  • Firstpage
    384
  • Lastpage
    389
  • Abstract
    The Amharic language is the principal language of over 20 million people, mainly in Ethiopia. An extensive literature survey reveals no journal or conference papers on Amharic character recognition. The Amharic script has 33 basic characters, each with seven orders, giving 231 distinct characters, not including numbers and punctuation symbols. The characters are cursive but not connected and, unlike other cursive scripts, do not use dots. We describe the Amharic script and discuss the difficulties of applying conventional structural and syntactic recognition processes. Two statistical algorithms for identifying Amharic characters are described. In both, the characters are normalised for size and orientation. The first compares the character against a series of templates. The second derives a characteristic signature from the character and compares this against a set of signature templates. The signatures used are fifty times smaller than the original character, and the recognition process is correspondingly faster but with some loss of accuracy. The statistical techniques described have been fully implemented, and the resulting performance is outlined.
  • Keywords
    character sets; computational linguistics; handwritten character recognition; natural languages; optical character recognition; Amharic character recognition; Amharic language; Amharic script; OCR; basic character; characteristic signature; confusion matrix; cursive script; distinct character; optical character recognition; punctuation symbol; signature based algorithm; signature template; structural recognition process; syntactic recognition process; Character recognition; Computer science; Graphics; Information systems; Keyboards; Licenses; Optical character recognition software; Optical losses; Pricing; Road vehicles;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Visualization, 2003. IV 2003. Proceedings. Seventh International Conference on
  • Print_ISBN
    0-7695-1988-1
  • Type
    conf
  • DOI
    10.1109/IV.2003.1218014
  • Filename
    1218014
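
The abstract contrasts matching a normalised character against full templates with matching a much smaller signature against signature templates. The record does not say how the signatures are derived, so the following is only a minimal sketch under stated assumptions: glyphs are small binary grids, the signature is a hypothetical per-block ink count, and the distance is a sum of absolute differences. The names `signature`, `distance`, `classify` and `flatten` are illustrative and do not come from the paper.

```python
# Illustrative sketch only: the signature definition here (per-block ink
# counts) is an assumption, not the method described in the paper.

def flatten(glyph):
    """Turn a binary glyph (list of equal-length rows of 0/1) into a flat list."""
    return [pixel for row in glyph for pixel in row]

def signature(glyph, block_rows, block_cols):
    """Reduce a glyph to one ink count per block_rows x block_cols block."""
    rows, cols = len(glyph), len(glyph[0])
    sig = []
    for r0 in range(0, rows, block_rows):
        for c0 in range(0, cols, block_cols):
            sig.append(sum(
                glyph[r][c]
                for r in range(r0, min(r0 + block_rows, rows))
                for c in range(c0, min(c0 + block_cols, cols))
            ))
    return sig

def distance(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def classify(sample, templates):
    """Return the label whose template vector is closest to the sample."""
    return min(templates, key=lambda label: distance(sample, templates[label]))

# Two toy 4x4 "characters" and a noisy copy of the first one.
bar_v = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
bar_h = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
noisy = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0]]

# First approach: compare every pixel against full templates.
full_templates = {"v": flatten(bar_v), "h": flatten(bar_h)}
full_label = classify(flatten(noisy), full_templates)

# Second approach: compare short signatures (4 values instead of 16 pixels).
sig_templates = {"v": signature(bar_v, 2, 2), "h": signature(bar_h, 2, 2)}
sig_label = classify(signature(noisy, 2, 2), sig_templates)
```

In this toy case the signature is four times smaller than the glyph, so each comparison touches four values instead of sixteen. The fifty-fold reduction quoted in the abstract would need larger glyphs and a coarser signature, and different characters can collapse to similar signatures, which is one way the loss of accuracy mentioned in the abstract could arise.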