DocumentCode
312004
Title
Modeling long distance dependence in language: topic mixtures vs. dynamic cache models
Author
Iyer, R. ; Ostendorf, M.
Author_Institution
Dept. of Electr. & Comput. Eng., Boston Univ., MA, USA
Volume
1
fYear
1996
fDate
3-6 Oct 1996
Firstpage
236
Abstract
We investigate a new statistical language model which captures topic-related dependencies of words within and across sentences. First, we develop a sentence-level mixture language model that takes advantage of the topic constraints in a sentence or article. Second, we introduce topic-dependent dynamic cache adaptation techniques in the framework of the mixture model. Experiments with the static (or unadapted) mixture model on the 1994 WSJ task indicated a 21% reduction in perplexity and a 3-4% improvement in recognition accuracy over a general n-gram model. The static mixture model also improved recognition performance over an adapted n-gram model. Mixture adaptation techniques contributed a further 14% reduction in perplexity and a small improvement in recognition accuracy.
Keywords
cache storage; computational linguistics; natural language interfaces; probability; software performance evaluation; speech recognition; statistical analysis; WSJ task; dynamic cache models; long distance dependence; n-gram model; sentence-level mixture language model; speech recognition performance; static mixture model; statistical language model; topic constraints; topic mixtures; topic-dependent dynamic cache adaptation; topic-related dependencies; words; Costs; Equations; Markov processes; Mars; Natural languages; Parameter estimation; Probability; Robustness; Speech recognition; Statistics
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607085
Filename
607085