Title :
A Text Line Detection Method for Mathematical Formula Recognition
Author :
Xiaoyan Lin ; Liangcai Gao ; Zhi Tang ; James Baker ; Mohamed Alkalai ; Volker Sorge
Author_Institution :
Inst. of Comput. Sci. & Technol., Peking Univ., Beijing, China
Abstract :
Text line detection is a prerequisite for mathematical formula recognition. However, existing segmentation methods such as Projection Profiles Cutting (PPC) or white space analysis often produce incorrectly segmented text lines because of the two-dimensional structure of mathematics. These incorrectly detected text lines adversely affect mathematical formula recognition, with errors propagating through subsequent processing steps. We propose a text line detection method, aimed at mathematical formula recognition, that produces reliable line segmentation. Starting from the results produced by PPC, a learning-based merging strategy combines incorrectly split text lines. The merging strategy uses the layout and text features of each text line, and those between successive lines, to detect the incorrectly split text lines. Experimental results show that the proposed approach performs well in detecting text lines in mathematical documents. Furthermore, adopting the proposed text line detection method significantly reduces the error rate in mathematical formula identification.
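The two-stage idea in the abstract (cut with PPC, then merge over-split lines) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ppc_text_lines` and `merge_lines` and the toy `page` are hypothetical, and the gap threshold in `close_enough` is only an assumed stand-in for the learned classifier over layout and text features that the abstract describes.

```python
def ppc_text_lines(image):
    """Cut a binary image into lines using a horizontal projection profile.

    `image` is a list of rows, each a list of 0 (background) or 1 (ink).
    Returns (top, bottom) row ranges, with `bottom` exclusive.
    """
    profile = [sum(row) for row in image]  # ink pixels per row
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # a run of inked rows begins
        elif ink == 0 and start is not None:
            lines.append((start, y))       # a blank row ends the run
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines


def merge_lines(lines, should_merge):
    """Merge successive lines whenever `should_merge(prev, cur)` is true."""
    merged = []
    for line in lines:
        if merged and should_merge(merged[-1], line):
            merged[-1] = (merged[-1][0], line[1])
        else:
            merged.append(line)
    return merged


# Assumption: a fixed vertical-gap threshold replaces the learned decision.
def close_enough(prev, cur, max_gap=1):
    return cur[0] - prev[1] <= max_gap


# Toy page: one plain text line, then a fraction (numerator, bar, denominator).
page = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],  # plain text line
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],  # numerator
    [0, 0, 0, 0],
    [1, 1, 1, 1],  # fraction bar
    [0, 0, 0, 0],
    [0, 1, 1, 0],  # denominator
    [0, 0, 0, 0],
]

split = ppc_text_lines(page)              # [(1, 2), (5, 6), (7, 8), (9, 10)]
merged = merge_lines(split, close_enough)  # [(1, 2), (5, 10)]
```

PPC alone splits the fraction into three lines because blank rows separate the numerator, bar, and denominator. The merging pass rejoins them while leaving the plain text line untouched.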
Keywords :
image segmentation; learning (artificial intelligence); text analysis; text detection; PPC; error rate; incorrectly split text line detection method; layout features; learning-based merging strategy; mathematical documents; mathematical formula identification; mathematical formula recognition; text features; text line segmentation; two-dimensional mathematics structures; Accuracy; Feature extraction; Layout; Merging; Testing; Text recognition; Training; Text line detection; mathematical formula identification; mathematical formula recognition;
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/ICDAR.2013.75