DocumentCode :
3524548
Title :
Automatic disambiguation of Latin abbreviations in early modern texts for humanities digital libraries
Author :
Rydberg-Cox, Jeffrey A.
Author_Institution :
Dept. of English, Missouri Univ., Kansas City, MO, USA
fYear :
2003
fDate :
27-31 May 2003
Firstpage :
372
Lastpage :
373
Abstract :
Early modern books written in Latin contain many abbreviations of common words that are derived from earlier manuscript practice. While a reader well-versed in Latin can usually decipher these abbreviations easily, they pose technical problems for full-text digitization: they are difficult to capture by OCR or manual keying and, if they are not expanded correctly, they limit the effectiveness of information retrieval and reading support tools in the digital library. We describe a method for the automatic expansion and disambiguation of these abbreviations.
Keywords :
digital libraries; humanities; information retrieval; optical character recognition; software tools; text analysis; word processing; Latin abbreviation automatic disambiguation; OCR; abbreviation automatic expansion; common word; full text digitization; humanities digital library; information retrieval; reading support tool; tagging early modern text; Books; Cities and towns; Costs; History; Humans; Information retrieval; Optical character recognition software; Software libraries; Tagging; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Digital Libraries, 2003. Proceedings. 2003 Joint Conference on
Print_ISBN :
0-7695-1939-3
Type :
conf
DOI :
10.1109/JCDL.2003.1204892
Filename :
1204892