• DocumentCode
    442177
  • Title
    Research on optical formulas extraction
  • Author
    Tian, Xue-dong ; Sun, Wei-Zhong ; Ha, Ming-Hu
  • Author_Institution
    Coll. of Phys. Sci. & Technol., Hebei Univ., Baoding, China
  • Volume
    8
  • fYear
    2005
  • fDate
    18-21 Aug. 2005
  • Firstpage
    4886
  • Abstract
    Automatic recognition and reconstruction of formulas are key parts of an OCR (optical character recognition) system, and mathematical formula extraction is the first step in this technique. Little has been done in this area; some research has focused on mathematical formulas in printed documents. An approach to mathematical formula extraction that combines the MSE feature of CCXs with heuristic rules is proposed. The approach based on the MSE feature of the CCXs is used to extract isolated formulas from printed documents, and heuristic rules are used to extract embedded formulas from image blocks. The experiments indicate that combining the two methods yields favorable results.
  • Keywords
    feature extraction; optical character recognition; CCX; MSE feature; OCR system; automatic formula recognition; automatic formula reconstruction; heuristic rules; optical characters recognition; optical mathematical formula extraction; Character recognition; Data mining; Educational institutions; Image reconstruction; Image storage; Internet; Optical character recognition software; Physics; Software libraries; Sun; MSE feature; OCR; heuristic rule; optical formula extraction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
  • Conference_Location
    Guangzhou, China
  • Print_ISBN
    0-7803-9091-1
  • Type
    conf
  • DOI
    10.1109/ICMLC.2005.1527803
  • Filename
    1527803