DocumentCode
682621
Title
ATHENA: Automatic text height extraction for the analysis of old handwritten manuscripts
Author
Pintus, Ruggero ; Ying Yang ; Rushmeier, Holly
Volume
1
fYear
2013
fDate
Oct. 28 2013-Nov. 1 2013
Firstpage
605
Lastpage
612
Abstract
Large-scale digital acquisition of deteriorating historical documents is necessary because of their value and fragility. Studying and browsing the resulting digital libraries is becoming crucial for scholars in the Cultural Heritage field, but doing so requires automatic tools for analyzing and indexing the items in these datasets. We present a layout analysis method that performs automatic text height estimation without any manual intervention or user-defined parameters. It proves to be a robust technique for very noisy and damaged handwritten manuscripts. The effectiveness of the method is demonstrated on a large, heterogeneous corpus of medieval manuscripts with different writing styles, affected by other uncontrollable factors such as ink bleed-through, background noise, and overtyping text lines.
Keywords
digital libraries; feature extraction; history; text analysis; ATHENA; automatic text height estimation; automatic text height extraction; background noise; cultural heritage; deteriorating historical documents; digital acquisition; ink bleed-through; layout analysis method; medieval manuscripts; old handwritten manuscripts analysis; overtyping text lines; writing styles; Correlation; Discrete Fourier transforms; Estimation; Image edge detection; Indexes; Layout; Robustness
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Heritage International Congress (DigitalHeritage), 2013
Conference_Location
Marseille
Print_ISBN
978-1-4799-3168-2
Type
conf
DOI
10.1109/DigitalHeritage.2013.6743802
Filename
6743802