• DocumentCode
    353702
  • Title
    Putting it all together: language model combination
  • Author
    Goodman, Joshua T.
  • Author_Institution
    Speech Technol. Group, Microsoft Corp., Redmond, WA, USA
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1647
  • Abstract
    In the past several years, a number of different language modeling improvements over simple trigram models have been found, including caching, higher-order n-grams, skipping, modified Kneser-Ney smoothing, and clustering. While all of these techniques have been studied separately, they have rarely been studied in combination. We find some significant interactions, especially with smoothing techniques. The combination of all techniques leads to up to a 45% perplexity reduction over a Katz (1987) smoothed trigram model with no count cutoffs, the highest such perplexity reduction reported.
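    For orientation, perplexity is the exponential of the negative mean log-probability a model assigns to held-out text, so the reported 45% reduction means the combined model's perplexity is 0.55 times that of the Katz-smoothed trigram baseline. The sketch below is not the paper's system: it scores a trigram model with plain deleted interpolation rather than the Katz or modified Kneser-Ney smoothing the paper compares, and the toy corpus, interpolation weights, and helper names (train_trigram, interpolated_prob) are illustrative assumptions.

    from collections import Counter
    import math

    def train_trigram(tokens):
        """Collect unigram, bigram, and trigram counts from a token stream."""
        uni = Counter(tokens)
        bi = Counter(zip(tokens, tokens[1:]))
        tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
        return uni, bi, tri

    def interpolated_prob(w1, w2, w3, uni, bi, tri, total, lambdas=(0.5, 0.3, 0.2)):
        """Estimate P(w3 | w1, w2) by deleted interpolation, a simple
        stand-in for the smoothing methods compared in the paper."""
        l3, l2, l1 = lambdas
        p3 = tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0
        p2 = bi[(w2, w3)] / uni[w2] if uni[w2] else 0.0
        p1 = uni[w3] / total  # assumes every test word was seen in training
        return l3 * p3 + l2 * p2 + l1 * p1

    def perplexity(tokens, uni, bi, tri, total):
        """Perplexity = exp(-mean log P) over the trigrams of the test text."""
        log_prob, n = 0.0, 0
        for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
            log_prob += math.log(interpolated_prob(w1, w2, w3, uni, bi, tri, total))
            n += 1
        return math.exp(-log_prob / n)

    if __name__ == "__main__":
        train = "the cat sat on the mat the cat ate".split()  # toy corpus
        uni, bi, tri = train_trigram(train)
        test = "the cat sat on the mat".split()
        pp = perplexity(test, uni, bi, tri, total=len(train))
        print(f"baseline perplexity: {pp:.2f}")  # a 45% reduction would mean 0.55 * pp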
  • Keywords
    linguistics; natural languages; nomograms; pattern clustering; smoothing methods; speech processing; caching; clustering; count cutoffs; higher-order n-grams; language model combination; modified Kneser-Ney smoothing; perplexity reduction; skipping; smoothing techniques; trigram models; History; Interpolation; Smoothing methods; Speech recognition; Training data
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), Proceedings
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2000.862064
  • Filename
    862064