Title :
A New Approach for Understanding of Structure of Printed Mathematical Expression
Author :
Guo, Yu-sheng ; Huang, Lei ; Liu, Chang-ping
Author_Institution :
Chinese Acad. of Sci., Beijing
Abstract :
This paper introduces a new approach for automatically understanding the structure of printed mathematical expressions (MEs). The method consists of three stages: matrix analysis, sub-expression analysis, and script expression analysis. In matrix analysis (sub-expression analysis), an ME (sub-expression) is decomposed into several basic matrices (sub-expressions) and some sub-expressions (script expressions) by reconstructing the global structure of the ME, and then every basic matrix (sub-expression) is analyzed bottom-up. In script expression analysis, a graph rewriting algorithm is adopted to build script relation trees among the symbols within a script expression. To calculate the confidence of the spatial relation between two symbols, a spatial relation model is built based on the Gaussian Mixture Model (GMM). Experiments were conducted on a database of 3268 images, and the results show that the proposed method works well: the top-1 perfect analysis accuracy reaches 92.3%.
Keywords :
Gaussian processes; document image processing; matrix algebra; optical character recognition; trees (mathematics); Gaussian mixture model; graph rewriting; matrix analysis; printed mathematical expression; script expression analysis; script relation trees; subexpression analysis; Algorithm design and analysis; Automation; Cybernetics; Image reconstruction; Machine learning; Mathematical model; Matrix decomposition; Optical character recognition software; Robustness; Tree graphs; Multi-candidate; Spatial relation model
Conference_Title :
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-0973-0
Electronic_ISBN :
978-1-4244-0973-0
DOI :
10.1109/ICMLC.2007.4370593