• DocumentCode
    312003
  • Title

    A category based approach for recognition of out-of-vocabulary words

  • Author

    Gallwitz, F. ; Nöth, E. ; Niemann, H.

  • Author_Institution
    Lehrstuhl für Mustererkennung, Erlangen-Nürnberg Univ., Germany
  • Volume
    1
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    228
  • Abstract
    In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant number of out-of-vocabulary words, even when the vocabulary size is very large. We present a new approach for the integration of out-of-vocabulary words into statistical language models. We use category information for all words in the training corpus to define a function that gives an approximation of the out-of-vocabulary word emission probability for each word category. This information is integrated into the language models. Although we use a simple acoustic model for out-of-vocabulary words, we achieve a 6% reduction in word error rate on spontaneous speech data with an out-of-vocabulary rate of about 5%.
  • Keywords
    computational linguistics; natural language interfaces; probability; speech recognition; statistical analysis; vocabulary; acoustic model; automatic speech recognition; category based approach; out-of-vocabulary word recognition; spontaneous speech data; spontaneous speech tasks; statistical language models; training corpus; word emission probability; word error rate; Acoustic applications; Acoustic emission; Automatic speech recognition; Context modeling; Information retrieval; Predictive models; Probability; Speech recognition; Telephony; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607083
  • Filename
    607083
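
The abstract describes a per-category function that approximates the out-of-vocabulary (OOV) word emission probability, but this record does not give its form. The sketch below is an illustration only and is not the authors' formula: it assumes a Good-Turing-style estimate in which the share of tokens in a category that belong to words seen exactly once stands in for the chance that the category emits an unseen word. The function name `oov_emission_probs`, the `(word, category)` input format, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict


def oov_emission_probs(tagged_corpus):
    """Estimate P(OOV | category) from (word, category) pairs.

    Assumption (not taken from the paper): the estimate is the
    singleton fraction n1 / N, where n1 is the number of word types
    seen exactly once in the category and N is the number of tokens
    in the category.
    """
    # Count how often each word occurs within each category.
    word_counts = defaultdict(Counter)
    for word, category in tagged_corpus:
        word_counts[category][word] += 1

    probs = {}
    for category, counts in word_counts.items():
        total_tokens = sum(counts.values())
        singletons = sum(1 for c in counts.values() if c == 1)
        probs[category] = singletons / total_tokens
    return probs


# Hypothetical toy corpus: an open category (CITY) and a closed one (FUNC).
corpus = [
    ("berlin", "CITY"), ("munich", "CITY"),
    ("berlin", "CITY"), ("hamburg", "CITY"),
    ("to", "FUNC"), ("to", "FUNC"),
    ("from", "FUNC"), ("from", "FUNC"),
]
probs = oov_emission_probs(corpus)
```

Under this assumption, an open category such as city names gets a higher OOV estimate than a closed category such as function words, which matches the intuition behind treating the OOV probability per category rather than globally.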