• DocumentCode
    3695074
  • Title

    Using multiple sequence alignment and statistical language model to integrate multiple Chinese address recognition outputs

  • Author

    Shengchang Chen;Shujing Lu;Ying Wen;Yue Lu

  • Author_Institution
    Shanghai Key Laboratory of Multidimensional Information Processing, Department of Computer Science and Technology, East China Normal University, 200241, China
  • fYear
    2015
  • Firstpage
    151
  • Lastpage
    155
  • Abstract
    Different recognizers tend to make different mistakes when recognizing the same Chinese address. In this paper, we present a method that combines the outputs of multiple Chinese address recognizers to improve recognition accuracy. The method first employs multiple sequence alignment to generate a lattice of candidate hypotheses from the outputs of the different recognizers, and then applies a statistical language model to choose the maximum-likelihood candidate sequence, which is taken as the final decision. The method outperforms both the individual recognizers and Miyao's method. Experiments on address images from real envelopes demonstrate that the proposed method increases the character recognition accuracy from 95.80% to 98.38%, a 61.30% error reduction. Furthermore, the correct sorting rate of an automatic mail sorting system increases from 84.11% to 93.72%.
  • Keywords
    "Image segmentation","Training","Optical character recognition software"
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
  • Type

    conf

  • DOI
    10.1109/ICDAR.2015.7333742
  • Filename
    7333742
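
The second stage described in the abstract (choosing the most likely sequence from a lattice of aligned recognizer outputs) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the alignment step has already produced equal-length strings with a gap symbol, and it swaps in a character bigram model with add-one smoothing and a fixed gap penalty, since the abstract does not specify the language model. All names (GAP, build_lattice, train_bigram, log_prob, best_path, gap_logp) are hypothetical.

```python
import math
from collections import Counter

GAP = "-"    # hypothetical gap symbol emitted by the alignment step
START = "^"  # hypothetical start-of-sequence marker


def build_lattice(aligned):
    """Turn equal-length aligned strings into one candidate list per column."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned outputs must be non-empty and of equal length")
    return [sorted(set(column)) for column in zip(*aligned)]


def train_bigram(corpus):
    """Count character bigrams over training strings (assumed address texts)."""
    uni, bi, vocab = Counter(), Counter(), set()
    for text in corpus:
        prev = START
        for ch in text:
            uni[prev] += 1
            bi[(prev, ch)] += 1
            vocab.add(ch)
            prev = ch
    return uni, bi, len(vocab)


def log_prob(prev, ch, model):
    """Add-one smoothed log P(ch | prev); the extra +1 covers unseen characters."""
    uni, bi, vocab_size = model
    return math.log((bi[(prev, ch)] + 1) / (uni[prev] + vocab_size + 1))


def best_path(lattice, model, gap_logp=math.log(1e-3)):
    """Viterbi search over the lattice; the state is the last emitted character."""
    beams = {START: (0.0, "")}  # last character -> (log score, text so far)
    for column in lattice:
        nxt = {}
        for prev, (score, text) in beams.items():
            for cand in column:
                if cand == GAP:
                    # A gap emits nothing; gap_logp is an assumed fixed penalty.
                    key, s, t = prev, score + gap_logp, text
                else:
                    key = cand
                    s = score + log_prob(prev, cand, model)
                    t = text + cand
                if key not in nxt or s > nxt[key][0]:
                    nxt[key] = (s, t)
        beams = nxt
    return max(beams.values())[1]


if __name__ == "__main__":
    # Toy training corpus and three recognizer outputs, each with one error.
    corpus = ["上海市中山北路", "上海市中山北路", "上海市普陀区"]
    outputs = ["上诲市中山北路", "上海市中山比路", "上海币中山北路"]
    print(best_path(build_lattice(outputs), train_bigram(corpus)))
```

In this toy example each recognizer output contains a different single-character error, and the bigram scores favor the character sequence seen in the training corpus at every disputed column. A real system would need a proper multiple sequence alignment step in front of build_lattice and a language model trained on a large address corpus.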