DocumentCode
3524548
Title
Automatic disambiguation of Latin abbreviations in early modern texts for humanities digital libraries
Author
Rydberg-Cox, Jeffrey A.
Author_Institution
Dept. of English, Missouri Univ., Kansas City, MO, USA
fYear
2003
fDate
27-31 May 2003
Firstpage
372
Lastpage
373
Abstract
Early modern books written in Latin contain many abbreviations of common words that are derived from earlier manuscript practice. While these abbreviations are usually easily deciphered by a reader well-versed in Latin, they pose technical problems for full-text digitization: they are difficult to OCR or to have typed, and, if they are not expanded correctly, they limit the effectiveness of information retrieval and reading support tools in the digital library. We describe a method for the automatic expansion and disambiguation of these abbreviations.
Keywords
digital libraries; humanities; information retrieval; optical character recognition; software tools; text analysis; word processing; Latin abbreviation automatic disambiguation; OCR; abbreviation automatic expansion; common word; full text digitization; humanities digital library; information retrieval; reading support tool; tagging early modern text; Books; Cities and towns; Costs; History; Humans; Information retrieval; Optical character recognition software; Software libraries; Tagging; Text recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Libraries, 2003. Proceedings. 2003 Joint Conference on
Print_ISBN
0-7695-1939-3
Type
conf
DOI
10.1109/JCDL.2003.1204892
Filename
1204892