DocumentCode :
2789658
Title :
Topic cache language model for speech recognition
Author :
Chueh, Chuang-Hua ; Chien, Jen-Tzung
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
5194
Lastpage :
5197
Abstract :
Traditional n-gram language models suffer from insufficient long-distance information. The cache language model, which captures the dynamics of word occurrences in a cache, can compensate for this weakness. This paper presents a new topic cache model for speech recognition based on the latent Dirichlet language model, where the latent topic structure is explored from n-gram events and employed for word prediction. In particular, the long-distance topic information is continuously updated from the large-span historical words and dynamically incorporated in generating the topic mixtures through Bayesian learning. The topic cache language model effectively characterizes unseen n-gram events and captures the topic cache for long-distance language modeling. In experiments on the Wall Street Journal corpus, the proposed method achieves better performance than the baseline n-gram and other related language models in terms of perplexity and recognition accuracy.
Keywords :
Bayes methods; computational linguistics; learning (artificial intelligence); natural language processing; speech recognition; word processing; Bayesian learning; cache language model; large-span historical word; latent Dirichlet language; long-distance topic information; n-gram language model; speech recognition; Bayesian methods; Clustering methods; Computer science; Data mining; Linear discriminant analysis; Natural languages; Parameter estimation; Predictive models; Speech recognition; Statistics; Bayes procedure; Natural language; clustering method; smoothing method; speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495011
Filename :
5495011