• DocumentCode
    594725
  • Title
    Collecting historical font metrics from Google Books
  • Author
    LiVolsi, R.; Zanibbi, Richard; Bigelow, C.
  • fYear
    2012
  • fDate
    11-15 Nov. 2012
  • Firstpage
    351
  • Lastpage
    355
  • Abstract
    A system is presented for extracting key metrics from fonts used in historical documents. The system identifies important landmarks on a page, such as margins, paragraphs, and lines, and applies frequency analysis techniques to identify relevant sizes. The system was validated by comparing its measurements to the measurements of a human expert on randomly selected samples, and differed on average from the expert by less than 5% for x-height, body size, and line spacing metrics.
  • Keywords
    document image processing; history; Google Books; frequency analysis techniques; historical documents; historical font metrics; line spacing metrics; randomly selected samples; Google; Humans; Image segmentation; Noise; Size measurement; Standards
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition (ICPR), 2012 21st International Conference on
  • Conference_Location
    Tsukuba
  • ISSN
    1051-4651
  • Print_ISBN
    978-1-4673-2216-4
  • Type
    conf
  • Filename
    6460144