Title :
Keyword Spotting and Retrieval of Document Images Captured by a Digital Camera
Author :
Lu, Shijian ; Tan, Chew Lim
Author_Institution :
Nat. Univ. of Singapore, Singapore
Abstract :
This paper presents a keyword spotting technique that locates keywords within document images captured by a digital camera. In the proposed technique, the shape of word images in perspective view is captured by using three perspective invariants, namely, holes, water reservoirs, and character ascenders and descenders. Given a camera image of a document, text lines and word images are first segmented through connected component analysis. The three perspective invariants are then detected through two rounds of scanning, which transliterate each character image into a character shape code of dimension six and so convert each word image into a word shape code. Keywords within camera images of documents are finally located through a partial matching process. Experiments show some promising results.
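The final step, locating a keyword by partially matching word shape codes, can be illustrated with a minimal sketch. It assumes each character has already been converted to a six-element code. The tuple layout, the function names `char_distance` and `partial_match`, and the `max_mismatch` tolerance are hypothetical choices made for illustration and are not taken from the paper, whose exact matching rule is not given in the abstract.

```python
from typing import List, Sequence, Tuple

# Hypothetical six-dimensional character shape code. What each of the six
# entries stands for is an assumption here, not the paper's definition.
CharCode = Tuple[int, int, int, int, int, int]


def char_distance(a: CharCode, b: CharCode) -> int:
    """Count the positions at which two character shape codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def partial_match(
    keyword: Sequence[CharCode],
    word: Sequence[CharCode],
    max_mismatch: int = 0,
) -> List[int]:
    """Return every offset where `keyword` aligns with a contiguous slice of
    `word` with at most `max_mismatch` differing code entries in total.

    This sliding-window comparison is an assumed stand-in for the paper's
    partial matching process.
    """
    hits: List[int] = []
    k = len(keyword)
    if k == 0 or k > len(word):
        return hits
    for start in range(len(word) - k + 1):
        window = word[start:start + k]
        cost = sum(char_distance(kc, wc) for kc, wc in zip(keyword, window))
        if cost <= max_mismatch:
            hits.append(start)
    return hits
```

Matching on a slice of the word code rather than the whole code lets a keyword be found inside a longer word, for example when a suffix is attached. A nonzero `max_mismatch` is one possible way to tolerate a misdetected invariant.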
Keywords :
image sensors; optical character recognition; character image; character shape code; digital camera; images captured document; keyword spotting technique; partial matching process; scanning process; text line; word images; Digital cameras; Image analysis; Image coding; Image converters; Image retrieval; Image segmentation; Optical character recognition software; Reservoirs; Shape; Water resources;
Conference_Title :
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location :
Parana
Print_ISBN :
978-0-7695-2822-9
DOI :
10.1109/ICDAR.2007.4377064