  • DocumentCode
    2338395
  • Title
    A recognition algorithm for Chinese characters in diverse fonts
  • Author
    Wu, Xianli ; Wu, Min
  • Author_Institution
    Eng. Center of Character Recognition, Chinese Acad. of Sci., Beijing, China
  • Volume
    3
  • fYear
    2002
  • fDate
    24-28 June 2002
  • Firstpage
    981
  • Abstract
    The paper proposes an algorithm for recognizing Chinese characters in many diverse fonts, including Song, Fang, Kai, Hei, Yuan, Lishu, Weibei and Xingkai. The algorithm is based on features derived from peripheral direction contributions and uses a set of dictionaries. A three-level matching is first performed against each dictionary, and the distance measures from these matches are then fed into a central discriminator, which outputs the final recognition result. For the central discriminator, we propose a new multi-dictionary matching algorithm that uses estimated information about neighborhood fonts. Experiments were performed on a practical OCR software system whose recognition kernel is based on the proposed algorithm. Fast and accurate recognition was achieved both in title recognition, which involves all eight fonts, and in main-body recognition, which usually involves only the first four, most commonly used fonts.
  • Keywords
    character sets; image matching; natural language interfaces; optical character recognition; Chinese character recognition algorithm; OCR software system; central discriminator; diverse fonts; multi-dictionary matching algorithm; peripheral direction contributions; Asia; Character recognition; Dictionaries; Feature extraction; Kernel; Natural languages; Optical character recognition software; Research and development; Software algorithms; Software systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing. 2002. Proceedings. 2002 International Conference on
  • ISSN
    1522-4880
  • Print_ISBN
    0-7803-7622-6
  • Type
    conf
  • DOI
    10.1109/ICIP.2002.1039139
  • Filename
    1039139
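
The flow outlined in the abstract (match the input against one dictionary per font, then pass the distances to a central discriminator) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the feature vectors are toy values rather than peripheral direction contributions, the city-block distance and the names `distance`, `match_dictionary` and `central_discriminator` are hypothetical, a single nearest-template search stands in for the three-level matching, and the discriminator simply picks the smallest distance without using any neighborhood-font information.

```python
from typing import Dict, List, Tuple

Feature = List[float]
Dictionary = Dict[str, Feature]  # character label -> template feature vector


def distance(a: Feature, b: Feature) -> float:
    """City-block distance between two equal-length vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_dictionary(feature: Feature, dictionary: Dictionary) -> Tuple[str, float]:
    """Return the closest label in one font dictionary and its distance.

    A single nearest-template search; it stands in for the paper's
    three-level matching, whose details the abstract does not give.
    """
    label = min(dictionary, key=lambda c: distance(feature, dictionary[c]))
    return label, distance(feature, dictionary[label])


def central_discriminator(
    feature: Feature, dictionaries: Dict[str, Dictionary]
) -> Tuple[str, str, float]:
    """Return (font, label, distance) with the smallest distance over all fonts.

    Hypothetical simplification: no neighborhood-font estimate is used.
    """
    results = [
        (font, *match_dictionary(feature, d)) for font, d in dictionaries.items()
    ]
    return min(results, key=lambda r: r[2])


# Toy dictionaries for two fonts; labels and values are made up.
toy_dictionaries = {
    "Song": {"A": [0.0, 0.0], "B": [4.0, 4.0]},
    "Kai": {"A": [1.0, 1.0], "B": [5.0, 5.0]},
}

print(central_discriminator([4.5, 5.0], toy_dictionaries))  # ('Kai', 'B', 0.5)
```

In this toy run the input lies closest to template "B" in the "Kai" dictionary, so the discriminator reports that font and label. A discriminator closer to the one the abstract describes would also weigh distances from related fonts instead of taking a plain minimum.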