• DocumentCode
    3524548
  • Title
    Automatic disambiguation of Latin abbreviations in early modern texts for humanities digital libraries
  • Author
    Rydberg-Cox, Jeffrey A.
  • Author_Institution
    Dept. of English, Missouri Univ., Kansas City, MO, USA
  • fYear
    2003
  • fDate
    27-31 May 2003
  • Firstpage
    372
  • Lastpage
    373
  • Abstract
    Early modern books written in Latin contain many abbreviations of common words that are derived from earlier manuscript practice. While these abbreviations are usually easily deciphered by a reader well-versed in Latin, they pose technical problems for full-text digitization: they are difficult to OCR or to have typed and, if they are not expanded correctly, they limit the effectiveness of information retrieval and reading support tools in the digital library. We describe a method for the automatic expansion and disambiguation of these abbreviations.
  • Keywords
    digital libraries; humanities; information retrieval; optical character recognition; software tools; text analysis; word processing; Latin abbreviation automatic disambiguation; OCR; abbreviation automatic expansion; common word; full text digitization; humanities digital library; information retrieval; reading support tool; tagging early modern text; Books; Cities and towns; Costs; History; Humans; Information retrieval; Optical character recognition software; Software libraries; Tagging; Text recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Libraries, 2003. Proceedings. 2003 Joint Conference on
  • Print_ISBN
    0-7695-1939-3
  • Type
    conf
  • DOI
    10.1109/JCDL.2003.1204892
  • Filename
    1204892