• DocumentCode
    2979423
  • Title
    Analyzing and predicting language model improvements
  • Author
    Iyer, R. ; Ostendorf, M. ; Meteer, M.
  • Author_Institution
    Electr. & Comput. Eng. Dept., Boston Univ., MA, USA
  • fYear
    1997
  • fDate
    14-17 Dec 1997
  • Firstpage
    254
  • Lastpage
    261
  • Abstract
    Statistical n-gram language models are traditionally developed using perplexity as a measure of goodness. However, perplexity often demonstrates a poor correlation with recognition improvements, mainly because it fails to account for the acoustic confusability between words and for search errors in a recognizer. In this paper, we study alternatives to perplexity for predicting language model performance, including other global features as well as a new approach that predicts, with a high correlation (0.96), performance differences associated with localized changes in language models, given a recognition system. Experiments focus on the problem of augmenting in-domain Switchboard text with out-of-domain text from the Wall Street Journal and broadcast news that differ in both style and content from the in-domain data.
  • Keywords
    natural languages; nomograms; performance index; speech recognition; statistics; Wall Street Journal; acoustic confusability; broadcast news; correlation; global features; goodness measure; in-domain Switchboard text; language model improvements; language model performance prediction; localized changes; out-of-domain text; performance differences; perplexity; search errors; speech recognition improvements; statistical n-gram language models; text content; text style; word confusion; Acoustic measurements; Broadcasting; Degradation; Error analysis; Internet; Natural languages; Predictive models; Probability; Speech recognition; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on
  • Conference_Location
    Santa Barbara, CA
  • Print_ISBN
    0-7803-3698-4
  • Type
    conf
  • DOI
    10.1109/ASRU.1997.659013
  • Filename
    659013