DocumentCode :
2030620
Title :
Word spotting in scanned images using hidden Markov models
Author :
Chen, Francine R. ; Wilcox, Lynn D. ; Bloomberg, Dan S.
Author_Institution :
Xerox Palo Alto Res. Center, CA, USA
Volume :
5
fYear :
1993
fDate :
27-30 April 1993
Firstpage :
1
Abstract :
A hidden-Markov-model (HMM)-based system for font-independent spotting of user-specified keywords in a scanned image is described. Word bounding boxes of potential keywords are extracted from the image using a morphology-based preprocessor. Feature vectors based on the external shape and internal structure of the word are computed over vertical columns of pixels in a word bounding box. For each user-specified keyword, an HMM is created by concatenating appropriate context-dependent character HMMs. Nonkeywords are modeled using an HMM based on context-dependent subcharacter models. Keyword spotting is performed using a Viterbi search through the HMM network created by connecting the keyword and nonkeyword HMMs in parallel. Applications of word-image spotting include information filtering in images from facsimile and copy machines, and information retrieval from text image databases.
Keywords :
hidden Markov models; image segmentation; mathematical morphology; optical character recognition; search problems; HMM network; Viterbi search; context-dependent subcharacter models; font-independent spotting; morphology-based preprocessor; scanned images; user-specified keywords; word bounding box; word-image spotting
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.1993.319732
Filename :
319732