DocumentCode
442177
Title
Research on optical formulas extraction
Author
Tian, Xue-dong ; Sun, Wei-Zhong ; Ha, Ming-Hu
Author_Institution
Coll. of Phys. Sci. & Technol., Hebei Univ., Baoding, China
Volume
8
fYear
2005
fDate
18-21 Aug. 2005
Firstpage
4886
Abstract
Automatic recognition and reconstruction of formulas are key parts in an OCR (optical character recognition) system. Mathematical formula extraction is the first step in this technique. Little has been done in this area. Some research was focused on mathematical formulas in printed documents. An approach containing both the MSE feature of CCXs and heuristic rules for mathematical formula extraction is proposed. The approach based on the MSE feature of CCXs is used to extract isolated formulas from printed documents and some heuristic rules are used to extract the embedded formulas from image blocks. The experiments indicate that a combination of the two methods can obtain favorable results.
Keywords
feature extraction; optical character recognition; CCX; MSE feature; OCR system; automatic formula recognition; automatic formula reconstruction; heuristic rules; optical characters recognition; optical mathematical formula extraction; Character recognition; Data mining; Educational institutions; Image reconstruction; Image storage; Internet; Optical character recognition software; Physics; Software libraries; Sun; MSE feature; OCR; heuristic rule; optical formula extraction;
fLanguage
English
Publisher
ieee
Conference_Titel
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location
Guangzhou, China
Print_ISBN
0-7803-9091-1
Type
conf
DOI
10.1109/ICMLC.2005.1527803
Filename
1527803