• DocumentCode
    2789658
  • Title

    Topic cache language model for speech recognition

  • Author

    Chueh, Chuang-Hua ; Chien, Jen-Tzung

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    5194
  • Lastpage
    5197
  • Abstract
    Traditional n-gram language models suffer from insufficient long-distance information. The cache language model, which captures the dynamics of word occurrences in a cache, can compensate for this weakness. This paper presents a new topic cache model for speech recognition based on the latent Dirichlet language model, in which the latent topic structure is explored from n-gram events and employed for word prediction. In particular, the long-distance topic information is continuously updated from the large-span historical words and dynamically incorporated in generating the topic mixtures through Bayesian learning. The topic cache language model effectively characterizes unseen n-gram events and captures the topic cache for long-distance language modeling. In experiments on the Wall Street Journal corpus, the proposed method achieves better performance than the baseline n-gram and other related language models in terms of perplexity and recognition accuracy.
  • Keywords
    Bayes methods; computational linguistics; learning (artificial intelligence); natural language processing; speech recognition; word processing; Bayesian learning; cache language model; large-span historical word; latent Dirichlet language; long-distance topic information; n-gram language model; speech recognition; Bayesian methods; Clustering methods; Computer science; Data mining; Linear discriminant analysis; Natural languages; Parameter estimation; Predictive models; Speech recognition; Statistics; Bayes procedure; Natural language; clustering method; smoothing method; speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2010.5495011
  • Filename
    5495011