• DocumentCode
    682621
  • Title

    ATHENA: Automatic text height extraction for the analysis of old handwritten manuscripts

  • Author

    Pintus, Ruggero; Yang, Ying; Rushmeier, Holly

  • Volume
    1
  • fYear
    2013
  • fDate
    Oct. 28 2013-Nov. 1 2013
  • Firstpage
    605
  • Lastpage
    612
  • Abstract
    Large-scale digital acquisition of deteriorating historical documents is necessary because of their value and fragility. Studying and browsing the resulting digital libraries is becoming crucial for scholars in the Cultural Heritage field, but it requires automatic tools for analyzing and indexing the items in those datasets. We present a layout analysis method that performs automatic text height estimation without any manual intervention or user-defined parameters. It proves to be robust on very noisy and damaged handwritten manuscripts. The effectiveness of the method is demonstrated on a large heterogeneous corpus of medieval manuscripts with different writing styles, affected by other uncontrollable factors such as ink bleed-through, background noise, and overtyping text lines.
  • Keywords
    digital libraries; feature extraction; history; text analysis; ATHENA; automatic text height estimation; automatic text height extraction; background noise; cultural heritage; deteriorating historical documents; digital acquisition; ink bleed-through; layout analysis method; medieval manuscripts; old handwritten manuscripts analysis; overtyping text lines; writing styles; Correlation; Discrete Fourier transforms; Estimation; Image edge detection; Indexes; Layout; Robustness
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Heritage International Congress (DigitalHeritage), 2013
  • Conference_Location
    Marseille
  • Print_ISBN
    978-1-4799-3168-2
  • Type

    conf

  • DOI
    10.1109/DigitalHeritage.2013.6743802
  • Filename
    6743802