DocumentCode :
757997
Title :
Application of Information Retrieval Technologies to Presentation Slides
Author :
Vinciarelli, Alessandro ; Odobez, Jean-Marc
Author_Institution :
IDIAP Res. Inst., Martigny
Volume :
8
Issue :
5
Year :
2006
Firstpage :
981
Lastpage :
995
Abstract :
Presentations are becoming an increasingly common means of communication in working environments, and slides are often the necessary supporting material on which the presentations rely. In this paper, we describe a slide indexing and retrieval system in which the slides are captured as images (through a framegrabber) at the moment they are displayed during a presentation and then transcribed with an optical character recognition (OCR) system. In this context, we show that such an approach presents several advantages over the use of commercial API-based software to obtain the slide transcriptions. We report a set of retrieval experiments conducted on a database of 26 real presentations (570 slides) collected at a workshop. The experiments show that the overall retrieval performance is close to that obtained using either a manual transcription of the slides or the API software. Moreover, the experiments show that the OCR-based approach significantly outperforms the API in extracting the text embedded in images and figures.
Keywords :
application program interfaces; database indexing; image retrieval; information retrieval systems; multimedia systems; optical character recognition; API software; OCR system; image extraction; information retrieval system; manual transcription; optical character recognition; presentation slide indexing; slide transcription; Character recognition; Data mining; Databases; Image retrieval; Indexing; Information resources; Information retrieval; Optical character recognition software; Optical materials; Software performance; Indexing; information retrieval; noisy text; optical character recognition; presentations; slides;
Language :
English
Journal_Title :
Multimedia, IEEE Transactions on
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2006.879870
Filename :
1703512