• DocumentCode
    1742153
  • Title

    Automatic training of page segmentation algorithms: an optimization approach

  • Author

    Mao, Song ; Kanungo, Tapas

  • Author_Institution
    Center for Autom. Res., Maryland Univ., College Park, MD, USA
  • Volume
    4
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    531
  • Abstract
    Most page segmentation algorithms have user-specifiable free parameters. However, algorithm designers typically do not provide a quantitative or rigorous method for choosing values for these parameters. The free parameter values can affect the segmentation result drastically and depend strongly on the particular dataset on which the algorithm is used. We present an automatic training method for choosing the free parameters of page segmentation algorithms. The automatic training problem is posed as a multivariate non-smooth function optimization problem. An efficient direct search method, the simplex method, is used to solve this optimization problem. This training method is applied to the training of Kise's page segmentation algorithm. It is found that a set of optimal parameter values and their corresponding performance index can be found using relatively few function evaluations. The UW III dataset was used for conducting our experiments.
  • Keywords
    document image processing; image segmentation; optimisation; search problems; UW III dataset; automatic training; direct search method; multivariate nonsmooth function optimization problem; optimization approach; page segmentation algorithms; performance index; simplex method; Algorithm design and analysis; Automation; Educational institutions; Image segmentation; Laboratories; Measurement; Optical character recognition software; Optimization methods; Search methods; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2000. Proceedings. 15th International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-0750-6
  • Type

    conf

  • DOI
    10.1109/ICPR.2000.902974
  • Filename
    902974
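
The abstract poses parameter training as minimizing a non-smooth function with a direct search simplex method. The following is a minimal sketch of that idea, not the authors' implementation: it uses a simplified Nelder-Mead variant (inside contraction only), and the objective toy_error, its two parameters, and the starting point are hypothetical stand-ins for a real segmentation error measured on a training set.

```python
def nelder_mead(f, x0, step=1.0, max_evals=200, tol=1e-6):
    """Minimize f with a simplified Nelder-Mead simplex search.

    Returns (best_point, best_value, number_of_function_evaluations).
    Only function values are used, so f does not need to be smooth.
    """
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each axis.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(v) for v in simplex]
    evals = n + 1

    while evals < max_evals:
        # Sort vertices from best (lowest value) to worst.
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if abs(values[-1] - values[0]) < tol:
            break

        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point on the line through worst and centroid: c + t * (c - w).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_r = f(reflected)
        evals += 1
        if f_r < values[0]:
            # Reflection beat the best vertex: try expanding further.
            expanded = along(2.0)
            f_e = f(expanded)
            evals += 1
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            # Reflection is better than the second-worst vertex: accept it.
            simplex[-1], values[-1] = reflected, f_r
        else:
            # Contract the worst vertex toward the centroid.
            contracted = along(-0.5)
            f_c = f(contracted)
            evals += 1
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # Shrink every vertex halfway toward the best vertex.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
                values = [values[0]] + [f(v) for v in simplex[1:]]
                evals += n

    k = min(range(n + 1), key=lambda j: values[j])
    return simplex[k], values[k], evals


def toy_error(params):
    # Hypothetical non-smooth stand-in for a segmentation error rate.
    # A real objective would run the segmenter with these parameter
    # values on training pages and score the result against ground truth.
    p0, p1 = params
    return abs(p0 - 12.0) + 0.5 * abs(p1 - 3.0)


best_params, best_err, n_evals = nelder_mead(toy_error, [5.0, 1.0])
print(best_params, best_err, n_evals)
```

Because each evaluation of a real objective means running the segmentation algorithm over the training set, the evaluation budget (max_evals here) is the cost that matters; the search itself needs no gradients.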