DocumentCode :
3341235
Title :
A Large-Scale Analysis of Mathematical Expressions for an Accurate Understanding of Their Structure
Author :
Aly, Walaa ; Uchida, Seiichi ; Suzuki, Masakazu
Author_Institution :
Kyushu Univ., Fukuoka
fYear :
2008
fDate :
16-19 Sept. 2008
Firstpage :
549
Lastpage :
556
Abstract :
A wide variety of mathematical expressions printed in scientific and technical reports can be recognized by analyzing their two-dimensional layout structure. In this paper, the position relation between adjacent characters is analyzed in order to discriminate automatically between baseline, subscript, and superscript characters. This analysis is one of the most important parts of structure analysis. The proposed method is very promising: using a distribution map, its results reached up to 99.76% on a very large database. The distribution map is defined by two important features, namely relative size and relative position.
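As a rough illustration of the two features named in the abstract, the sketch below computes a relative size and a relative position for a pair of adjacent character bounding boxes. The record does not give the paper's definitions, so everything here is an assumption: the `Box` type, the two formulas, and the fixed `threshold`, which stands in for a distribution map learned from data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Box:
    """Axis-aligned character bounding box; y grows downward (image coordinates)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def height(self) -> float:
        return self.bottom - self.top

    @property
    def center_y(self) -> float:
        return (self.top + self.bottom) / 2.0


def relative_features(parent: Box, child: Box) -> Tuple[float, float]:
    """Return (relative_size, relative_position) of child with respect to parent.

    Assumed definitions (not taken from the paper):
      relative_size     = child height / parent height
      relative_position = (parent centre y - child centre y) / parent height
    A positive relative_position means the child sits higher than the parent.
    """
    if parent.height <= 0:
        raise ValueError("parent box must have positive height")
    size = child.height / parent.height
    position = (parent.center_y - child.center_y) / parent.height
    return size, position


def classify(parent: Box, child: Box, threshold: float = 0.2) -> str:
    """Toy rule: a hypothetical fixed threshold replaces a learned distribution map."""
    _, position = relative_features(parent, child)
    if position > threshold:
        return "superscript"
    if position < -threshold:
        return "subscript"
    return "baseline"
```

A real system would presumably also normalize for character shape (ascenders, descenders, small letters) before comparing boxes; this sketch ignores that and only shows how the two features separate the three classes in simple cases.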
Keywords :
document image processing; mathematics computing; optical character recognition; very large databases; automatic discrimination; distribution map; large-scale analysis; layout structure; math OCR; mathematical expressions; position relation; scientific reports; structure analysis; technical reports; very large database; Character recognition; Information analysis; Large-scale systems; Optical character recognition software; Pattern recognition; Performance analysis; Spatial databases; Text analysis; Text recognition; Writing; Baseline characters; Mathematical documents; Subscript characters; Superscript characters
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location :
Nara
Print_ISBN :
978-0-7695-3337-7
Type :
conf
DOI :
10.1109/DAS.2008.53
Filename :
4670005