• DocumentCode
    1635977
  • Title
    Separate Chinese Character and English Character by Cascade Classifier and Feature Selection
  • Author
    Zhu, Yuanping ; Sun, Jun ; Minagawa, Akihiro ; Hotta, Yoshinobu ; Naoi, Satoshi
  • Author_Institution
    Fujitsu R&D Center Co., Ltd., Beijing, China
  • fYear
    2009
  • Firstpage
    1191
  • Lastpage
    1195
  • Abstract
    Separating Chinese characters from English characters is helpful for OCR. In this paper, a multi-level cascade classifier combined with feature selection is constructed to identify Chinese and English characters at the level of individual characters. Most samples are identified by the first node classifier; the remaining samples, which have low classification confidence, are fed to the subsequent node classifiers to obtain the final result. To exploit feature complementarity, each node classifier is trained on the low-confidence samples of its previous node classifier, with independent feature selection. Furthermore, a confidence bias is used to improve classifier generalization. The experimental results validate the effectiveness of this classifier.
  • Keywords
    feature extraction; image classification; image sampling; natural languages; optical character recognition; Chinese character; English character; OCR technique; confidence bias; feature selection; image sample; multilevel cascade classifier; Character recognition; Diversity reception; Laboratories; Natural languages; Optical character recognition software; Prototypes; Research and development; Support vector machine classification; Support vector machines; Text analysis; Cascade Classifier; Feature Selection; Language Identification; OCR;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4244-4500-4
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2009.164
  • Filename
    5277617
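
The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no details of the node classifiers, the features, or how the confidence bias is applied. Everything named below (`Node`, `cascade_predict`, `make_scorer`, the two toy features, the thresholds) is hypothetical, and treating the bias as an amount added to each node's acceptance threshold is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# A scorer maps a (sub)feature vector to (label, confidence in [0, 1]).
Scorer = Callable[[Sequence[float]], Tuple[str, float]]


@dataclass
class Node:
    feature_idx: List[int]  # features chosen for this node (hypothetical)
    scorer: Scorer          # stand-in for a trained node classifier
    threshold: float        # minimum confidence to accept this node's label


def cascade_predict(nodes: List[Node], x: Sequence[float],
                    bias: float = 0.0) -> Tuple[str, int]:
    """Return (label, index of the node that made the decision).

    A sample is accepted by the first node whose confidence reaches
    threshold + bias; the last node always decides.
    """
    for i, node in enumerate(nodes):
        sub = [x[j] for j in node.feature_idx]  # node-specific feature subset
        label, conf = node.scorer(sub)
        is_last = i == len(nodes) - 1
        if is_last or conf >= node.threshold + bias:
            return label, i
    raise ValueError("cascade needs at least one node")


def make_scorer(cut: float) -> Scorer:
    """Toy one-feature scorer: distance from the cut acts as confidence."""
    def scorer(sub: Sequence[float]) -> Tuple[str, float]:
        v = sub[0]
        label = "chinese" if v > cut else "english"
        return label, min(1.0, abs(v - cut) * 2)
    return scorer


# Two-node cascade over a toy feature vector [feature_a, feature_b].
nodes = [
    Node(feature_idx=[0], scorer=make_scorer(0.5), threshold=0.6),
    Node(feature_idx=[1], scorer=make_scorer(0.5), threshold=0.0),
]

confident = cascade_predict(nodes, [0.9, 0.2])            # decided at node 0
uncertain = cascade_predict(nodes, [0.55, 0.2])           # falls through to node 1
stricter = cascade_predict(nodes, [0.9, 0.2], bias=0.3)   # bias defers to node 1
```

In this sketch a positive bias makes early nodes more conservative, so more samples reach later nodes that use different features. Training each node on the previous node's low-confidence samples, as the abstract describes, is not shown.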