• DocumentCode
    2708918
  • Title

    Automatic detection of italic, bold and all-capital words in document images

  • Author

    Chaudhuri, B.B. ; Garain, U.

  • Author_Institution
    Comput. Vision & Pattern Recognition Unit, Indian Stat. Inst., Calcutta, India
  • Volume
    1
  • fYear
    1998
  • fDate
    16-20 Aug 1998
  • Firstpage
    610
  • Abstract
    We propose simple and fast algorithms for detecting italic, bold and all-capital words without performing actual character recognition. We present a statistical study which reveals that the detection of such words may play a key role in automatic information retrieval from documents. Moreover, detection of italic words can be used to improve the recognition accuracy of a text recognition system. A considerable number of document images have been tested; our algorithms give accurate results on all of them and are very easy to implement.
  • Keywords
    document image processing; optical character recognition; OCR; all-capital word detection; automatic word detection; bold word detection; italic word detection; text recognition system; Books; Character recognition; Computer vision; Degradation; Information retrieval; Optical character recognition software; Pattern recognition; Software systems; Testing; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on
  • Conference_Location
    Brisbane, Qld.
  • ISSN
    1051-4651
  • Print_ISBN
    0-8186-8512-3
  • Type

    conf

  • DOI
    10.1109/ICPR.1998.711217
  • Filename
    711217