• DocumentCode
    2631606
  • Title

    Detecting and locating partially specified keywords in scanned images using hidden Markov models

  • Author

    Chen, Francine R. ; Wilcox, Lynn D. ; Bloomberg, Dan S.

  • Author_Institution
    Xerox Palo Alto Res. Center, CA, USA
  • fYear
    1993
  • fDate
    20-22 Oct 1993
  • Firstpage
    133
  • Lastpage
    138
  • Abstract
    A hidden Markov model (HMM) based system for detecting and locating, or spotting, user-specified keywords in scanned images is described. The system is font-independent, and no pre-segmentation of text and graphics is required. The bounding boxes of potential lines of text are extracted from the image using morphology. Feature vectors based on the external shape and internal structure of characters are computed for each bounding box. A keyword HMM is created by concatenating appropriate context-dependent character HMMs. The non-keyword HMM is based on context-dependent sub-character models. Keywords are spotted using Viterbi decoding on an HMM network created from the keyword and non-keyword HMMs. This model allows detection of keywords embedded in a line without pre-segmentation of the line into words or characters. Thus keywords may be specified by a baseform, and variants of the keyword can be detected.
  • Keywords
    Viterbi decoding; feature extraction; hidden Markov models; word processing; bounding box; bounding boxes; context-dependent character HMMs; context-dependent sub-character models; external shape; feature vectors; font-independent; hidden Markov model; internal structure; keyword HMM; morphology; partially specified keywords; scanned images; user-specified keywords; Character recognition; Facsimile; Graphics; Hidden Markov models; Image recognition; Image retrieval; Image segmentation; Information retrieval; Morphology; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
  • Conference_Location
    Tsukuba Science City
  • Print_ISBN
    0-8186-4960-7
  • Type

    conf

  • DOI
    10.1109/ICDAR.1993.395765
  • Filename
    395765
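
The abstract's spotting step (Viterbi decoding over a network built from keyword and non-keyword HMMs) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the observations are plain symbols rather than image feature vectors, the keyword is the two-symbol string "ab", the non-keyword model is a single filler state `F`, and all probabilities and names (`viterbi`, `spot`, `K1`, `K2`) are hypothetical.

```python
import math

NEG_INF = float("-inf")

def log_table(table):
    """Convert a nested dict of probabilities into log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # scores[s] = best log-probability of any path that ends in state s
    scores = {s: log_start.get(s, NEG_INF) + log_emit[s].get(obs[0], NEG_INF)
              for s in states}
    back = []  # one dict of back-pointers per time step after the first
    for o in obs[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            best_prev = max(states,
                            key=lambda p: scores[p] + log_trans[p].get(s, NEG_INF))
            new_scores[s] = (scores[best_prev]
                             + log_trans[best_prev].get(s, NEG_INF)
                             + log_emit[s].get(o, NEG_INF))
            pointers[s] = best_prev
        scores = new_scores
        back.append(pointers)
    # Trace back from the best final state
    last = max(states, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path

# Toy network (all values assumed): filler state F loops on itself or enters
# the keyword; K1 -> K2 spells the keyword "ab"; K2 returns to the filler.
STATES = ["F", "K1", "K2"]
LOG_START = {"F": math.log(0.8), "K1": math.log(0.2)}
LOG_TRANS = log_table({
    "F": {"F": 0.8, "K1": 0.2},
    "K1": {"K2": 1.0},
    "K2": {"F": 1.0},
})
LOG_EMIT = log_table({
    "F": {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3},
    "K1": {"a": 0.9, "b": 0.05, "c": 0.05},
    "K2": {"a": 0.05, "b": 0.9, "c": 0.05},
})

def spot(text):
    """Return (start, end) index pairs where the decoded path is in the keyword."""
    _, path = viterbi(list(text), STATES, LOG_START, LOG_TRANS, LOG_EMIT)
    spans, start = [], None
    for i, state in enumerate(path):
        if state != "F" and start is None:
            start = i
        elif state == "F" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(path)))
    return spans

print(spot("ccabc"))
```

Because the filler state absorbs everything outside the keyword, the line never has to be segmented into words or characters beforehand: the decoded state path itself marks where the keyword begins and ends.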