DocumentCode :
682621
Title :
ATHENA: Automatic text height extraction for the analysis of old handwritten manuscripts
Author :
Pintus, Ruggero ; Ying Yang ; Rushmeier, Holly
Volume :
1
fYear :
2013
fDate :
Oct. 28 2013-Nov. 1 2013
Firstpage :
605
Lastpage :
612
Abstract :
The large-scale digital acquisition of deteriorating historical documents is mandatory due to their value and fragility. Studying and browsing such digital libraries is becoming crucial for scholars in the Cultural Heritage field, but it requires automatic tools for analyzing and indexing the items in those datasets. We present a layout analysis method that performs automatic text height estimation without any manual intervention or user-defined parameters. It proves to be a robust technique for very noisy and damaged handwritten manuscripts. The effectiveness of the method is demonstrated on a large heterogeneous corpus of medieval manuscripts with different writing styles, affected by other uncontrollable factors such as ink bleed-through, background noise, and overtyping text lines.
Keywords :
digital libraries; feature extraction; history; text analysis; ATHENA; automatic text height estimation; automatic text height extraction; background noise; cultural heritage; deteriorating historical documents; digital acquisition; ink bleed-through; layout analysis method; medieval manuscripts; old handwritten manuscripts analysis; overtyping text lines; writing styles; Correlation; Discrete Fourier transforms; Estimation; Image edge detection; Indexes; Layout; Robustness
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Digital Heritage International Congress (DigitalHeritage), 2013
Conference_Location :
Marseille
Print_ISBN :
978-1-4799-3168-2
Type :
conf
DOI :
10.1109/DigitalHeritage.2013.6743802
Filename :
6743802