• DocumentCode
    2576866
  • Title

    Applying Bayesian belief networks in approximate string matching for robust keyword-based retrieval

  • Author

    Schuller, Björn ; Müller, Ronald ; Rigoll, Gerhard ; Lang, Manfred

  • Author_Institution
    Inst. for Human-Machine Commun., Technische Univ. München, Germany
  • Volume
    3
  • fYear
    2004
  • fDate
    27-30 June 2004
  • Firstpage
    1999
  • Abstract
    We present a novel approach towards robust keyword-based retrieval. Bayesian belief networks are applied in a word-model based approximate string matching algorithm. Apart from a proven reliable performance in a working implementation on standard sources like digital text, wholly probabilistic modeling allows for integration of confidence measures and hypotheses obtained from preprocessing stages, like handwriting recognition or optical character recognition, respecting uncertainties on the lower levels. Furthermore, a flexible method to include the modeling of specific error types derived from humans and various input sources is provided. The remarkable performance of the algorithms presented was tested during extensive evaluation with respect to the Levenshtein distance, which can be seen as the basis of state-of-the-art methods in this research field. The tests ran on a 14 K database containing common international music titles and four 10 K databases consisting of the most frequently used words in English, German, French and Dutch.
  • Keywords
    approximation theory; belief networks; information retrieval; natural languages; string matching; text analysis; 10 K; 14 K; Bayesian belief networks; Dutch; English; French; German; Levenshtein distance; approximate string matching; confidence measures; digital text; handwriting recognition; international music titles; optical character recognition; probabilistic modeling; robust keyword-based retrieval; Bayesian methods; Character recognition; Databases; Handwriting recognition; Humans; Integrated optics; Measurement standards; Optical character recognition software; Robustness; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2004. ICME '04. 2004 IEEE International Conference on
  • Print_ISBN
    0-7803-8603-5
  • Type

    conf

  • DOI
    10.1109/ICME.2004.1394655
  • Filename
    1394655