• DocumentCode
    312004
  • Title

    Modeling long distance dependence in language: topic mixtures vs. dynamic cache models

  • Author

    Iyer, R. ; Ostendorf, M.

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Boston Univ., MA, USA
  • Volume
    1
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    236
  • Abstract
    We investigate a new statistical language model which captures topic-related dependencies of words within and across sentences. First, we develop a sentence-level mixture language model that takes advantage of the topic constraints in a sentence or article. Second, we introduce topic-dependent dynamic cache adaptation techniques in the framework of the mixture model. Experiments with the static (or unadapted) mixture model on the 1994 WSJ task indicated a 21% reduction in perplexity and a 3-4% improvement in recognition accuracy over a general n-gram model. The static mixture model also improved recognition performance over an adapted n-gram model. Mixture adaptation techniques contributed a further 14% reduction in perplexity and a small improvement in recognition accuracy.
  • Keywords
    cache storage; computational linguistics; natural language interfaces; probability; software performance evaluation; speech recognition; statistical analysis; WSJ task; dynamic cache models; long distance dependence; n-gram model; sentence-level mixture language model; speech recognition performance; static mixture model; statistical language model; topic constraints; topic mixtures; topic-dependent dynamic cache adaptation; topic-related dependencies; words; Costs; Equations; Markov processes; Mars; Natural languages; Parameter estimation; Probability; Robustness; Speech recognition; Statistics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607085
  • Filename
    607085