• DocumentCode
    3341235
  • Title
    A Large-Scale Analysis of Mathematical Expressions for an Accurate Understanding of Their Structure
  • Author
    Aly, Walaa ; Uchida, Seiichi ; Suzuki, Masakazu
  • Author_Institution
    Kyushu Univ., Fukuoka
  • fYear
    2008
  • fDate
    16-19 Sept. 2008
  • Firstpage
    549
  • Lastpage
    556
  • Abstract
    A wide variety of mathematical expressions printed in scientific and technical reports can be recognized by analyzing their two-dimensional layout structure. In this paper, the position relation between adjacent characters is analyzed in order to discriminate automatically between baseline, subscript, and superscript characters. This analysis is one of the most important parts of structure analysis. The proposed method is very promising: using a distribution map, it achieves results of up to 99.76% on a very large database. The distribution map is defined by two important features, namely relative size and relative position.
  • Keywords
    document image processing; mathematics computing; optical character recognition; very large databases; automatic discrimination; distribution map; large-scale analysis; layout structure; math OCR; mathematical expressions; position relation; scientific reports; structure analysis; technical reports; very large database; Character recognition; Information analysis; Large-scale systems; Optical character recognition software; Pattern recognition; Performance analysis; Spatial databases; Text analysis; Text recognition; Writing; Baseline characters; Mathematical documents; Subscript characters; Superscript characters;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
  • Conference_Location
    Nara
  • Print_ISBN
    978-0-7695-3337-7
  • Type
    conf
  • DOI
    10.1109/DAS.2008.53
  • Filename
    4670005