Title :
Optical Formula Extraction Based on Irregularity Degree
Author :
Tian, Xue-dong ; Tian, Da-Zeng ; Ha, Ming-Hu
Author_Institution :
Coll. of Phys. Sci. & Technol., Hebei Univ., Baoding
Abstract :
Optical formula extraction is an important step in mathematical formula recognition, which can convert scientific papers into their corresponding electronic format. Little research has been done in this area so far. This paper proposes an approach to extracting embedded formulas. It first invokes a searching algorithm to find the connected components of the input document, then calculates the layout feature of every component based on irregularity degree, and locates the formula symbols according to these features. Finally, several measurements, including linking grammar, are used to locate the formula areas. Experimental results indicate that the proposed method performs favorably.
Keywords :
feature extraction; optical character recognition; symbol manipulation; connected component; electronic format conversion; embedded formulas; formula symbol; irregularity degree; layout feature; linking grammar; mathematical formula recognition; optical formula extraction; scientific papers; searching algorithm; Character recognition; Cybernetics; Educational institutions; Image storage; Joining processes; Labeling; Machine learning; Mathematics; Optical character recognition software; Pattern recognition; Physics; Text recognition; Optical formula recognition; connected components; formula extraction
Conference_Title :
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location :
Dalian, China
Print_ISBN :
1-4244-0061-9
DOI :
10.1109/ICMLC.2006.258476