Title :
Automatic training of page segmentation algorithms: an optimization approach
Author :
Mao, Song ; Kanungo, Tapas
Author_Institution :
Center for Autom. Res., Maryland Univ., College Park, MD, USA
Abstract :
Most page segmentation algorithms have user-specifiable free parameters. However, algorithm designers typically do not provide a quantitative or rigorous method for choosing values for these parameters. The free parameter values can affect the segmentation result drastically and depend strongly on the particular dataset to which the algorithm is applied. We present an automatic training method for choosing free parameters of page segmentation algorithms. The automatic training problem is posed as a multivariate non-smooth function optimization problem. An efficient direct search method, the simplex method, is used to solve this optimization problem. This training method is applied to the training of Kise's page segmentation algorithm. It is found that a set of optimal parameter values and their corresponding performance index can be found using relatively few function evaluations. The UW III dataset was used for conducting our experiments.
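The idea of treating parameter training as direct search over a non-smooth objective can be illustrated with a minimal sketch. It assumes the "simplex method" named in the abstract is the Nelder-Mead downhill simplex, which the record does not confirm. The function `toy_error`, its two parameters, and all coefficient values are hypothetical stand-ins for a segmentation error measured on training pages; they are not taken from the paper.

```python
def nelder_mead(f, x0, step=1.0, max_evals=200, tol=1e-6):
    """Minimize f with a basic Nelder-Mead simplex search.

    Returns (best_x, best_f, n_evals). Only function values are used,
    so f does not need to be differentiable.
    """
    n = len(x0)
    # Initial simplex: x0 plus one vertex offset along each axis.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(v) for v in simplex]
    evals = n + 1

    while evals < max_evals:
        # Order vertices from best (lowest value) to worst.
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if abs(values[-1] - values[0]) < tol:
            break

        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_r = f(reflected)
        evals += 1
        if f_r < values[0]:
            # Reflection is the new best: try expanding further.
            expanded = along(2.0)
            f_e = f(expanded)
            evals += 1
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            # Better than the second-worst vertex: accept the reflection.
            simplex[-1], values[-1] = reflected, f_r
        else:
            # Contract toward the centroid from the worst vertex.
            contracted = along(-0.5)
            f_c = f(contracted)
            evals += 1
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0]
                for k in range(1, n + 1):
                    simplex[k] = [b + 0.5 * (x - b) for b, x in zip(best, simplex[k])]
                    values[k] = f(simplex[k])
                evals += n

    best_k = min(range(n + 1), key=lambda k: values[k])
    return simplex[best_k], values[best_k], evals


def toy_error(params):
    # Hypothetical non-smooth error surface with a kink at (3, 1).
    a, b = params
    return abs(a - 3.0) + 2.0 * abs(b - 1.0)


best_x, best_f, n_evals = nelder_mead(toy_error, [0.0, 0.0])
print(best_x, best_f, n_evals)
```

In a real training setup, each call to the objective would run the segmentation algorithm on the training set with the candidate parameter values and return an error score, so the number of function evaluations is the dominant cost.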
Keywords :
document image processing; image segmentation; optimisation; search problems; UW III dataset; automatic training; direct search method; multivariate nonsmooth function optimization problem; optimization approach; page segmentation algorithms; performance index; simplex method; Algorithm design and analysis; Automation; Educational institutions; Image segmentation; Laboratories; Measurement; Optical character recognition software; Optimization methods; Search methods; Training data;
Conference_Title :
Pattern Recognition, 2000. Proceedings. 15th International Conference on
Conference_Location :
Barcelona
Print_ISBN :
0-7695-0750-6
DOI :
10.1109/ICPR.2000.902974