• DocumentCode
    3022684
  • Title
    OCR based slide retrieval
  • Author
    Daddaoua, N.; Odobez, J.-M.; Vinciarelli, A.
  • Author_Institution
    IDIAP Res. Inst., Martigny, Switzerland
  • fYear
    2005
  • fDate
    29 Aug.-1 Sept. 2005
  • Firstpage
    945
  • Abstract
    This paper addresses the problem of acquiring, indexing and retrieving slides in the context of automatic oral presentation processing. Since the most suitable acquisition technique in such a context is a framegrabber (a device that captures as images the slides displayed on a screen), the slides must be transcribed with an optical character recognition (OCR) system. Retrieval experiments performed on a corpus of 570 slides (26 presentations) gathered at a workshop show that the performance obtained with the OCR transcriptions is close to that obtained by extracting the text from the electronic version (PDF or PPT) of the slides through the appropriate APIs.
  • Keywords
    feature extraction; image retrieval; indexing; optical character recognition; API; automatic oral presentation; framegrabber; slide acquisition; slide indexing; slide retrieval; text extraction; Background noise; Character recognition; Image converters; Image retrieval; Image segmentation; Indexing; Information retrieval; Optical character recognition software; Optical devices; Video recording
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
  • ISSN
    1520-5263
  • Print_ISBN
    0-7695-2420-6
  • Type
    conf
  • DOI
    10.1109/ICDAR.2005.169
  • Filename
    1575683