Title : 
Topic cache language model for speech recognition
         
        
            Author : 
Chueh, Chuang-Hua ; Chien, Jen-Tzung
         
        
            Author_Institution : 
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
         
        
        
        
        
        
            Abstract : 
Traditional n-gram language models lack long-distance information. The cache language model, which captures the dynamics of word occurrences in a cache, can compensate for this weakness. This paper presents a new topic cache model for speech recognition based on the latent Dirichlet language model, in which the latent topic structure is explored from n-gram events and employed for word prediction. In particular, the long-distance topic information is continuously updated from the large-span historical words and dynamically incorporated in generating the topic mixtures through Bayesian learning. The topic cache language model effectively characterizes unseen n-gram events and captures the topic cache for long-distance language modeling. In experiments on the Wall Street Journal corpus, the proposed method achieves better performance than the baseline n-gram and other related language models in terms of perplexity and recognition accuracy.
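
Note: as a rough illustration of the generic cache idea the abstract builds on, the sketch below interpolates a fixed background model with relative frequencies of recently seen words. It is not the paper's topic cache model: there are no latent topics and no Bayesian updating here, and the class name `CacheLanguageModel`, the unigram background, the toy vocabulary, and the parameters `cache_size` and `cache_weight` are all assumptions made for this example.

```python
from collections import Counter, deque


class CacheLanguageModel:
    """Interpolate a fixed background unigram model with a recency cache.

    Hypothetical illustration only; assumes cache_size >= 1 and that the
    background probabilities sum to 1.
    """

    def __init__(self, background, cache_size=200, cache_weight=0.2):
        self.background = background          # dict: word -> probability
        self.cache = deque(maxlen=cache_size)  # most recent words
        self.counts = Counter()                # word counts inside the cache
        self.cache_weight = cache_weight       # interpolation weight

    def observe(self, word):
        # When the cache is full, discount the word about to be evicted.
        if len(self.cache) == self.cache.maxlen:
            oldest = self.cache[0]
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]
        self.cache.append(word)
        self.counts[word] += 1

    def prob(self, word):
        p_background = self.background.get(word, 0.0)
        if not self.cache:
            return p_background
        p_cache = self.counts[word] / len(self.cache)
        return (1 - self.cache_weight) * p_background + self.cache_weight * p_cache


# Toy usage: a word repeated in the recent history becomes more probable.
background = {"stock": 0.25, "market": 0.25, "the": 0.5}
lm = CacheLanguageModel(background, cache_size=4, cache_weight=0.5)
before = lm.prob("stock")
for w in ["the", "stock", "stock", "market"]:
    lm.observe(w)
after = lm.prob("stock")
print(before, after)
```

In this toy setting the probability of "stock" rises once the word has appeared twice in the four-word history, which is the long-distance effect a plain n-gram model cannot express.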
         
        
            Keywords : 
Bayes methods; computational linguistics; learning (artificial intelligence); natural language processing; speech recognition; word processing; Bayesian learning; cache language model; large-span historical word; latent Dirichlet language; long-distance topic information; n-gram language model; Bayesian methods; Clustering methods; Computer science; Data mining; Linear discriminant analysis; Natural languages; Parameter estimation; Predictive models; Statistics; Bayes procedure; Natural language; clustering method; smoothing method
         
        
        
        
            Conference_Title : 
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
         
        
            Conference_Location : 
Dallas, TX
         
        
        
            Print_ISBN : 
978-1-4244-4295-9
         
        
            Electronic_ISBN : 
1520-6149
         
        
        
            DOI : 
10.1109/ICASSP.2010.5495011