DocumentCode :
2053683
Title :
PLSA enhanced with a long-distance bigram language model for speech recognition
Author :
Haidar, Md Akmal ; O'Shaughnessy, D.
Author_Institution :
INRS-EMT, Montreal, QC, Canada
fYear :
2013
fDate :
9-13 Sept. 2013
Firstpage :
1
Lastpage :
5
Abstract :
We propose a language modeling (LM) approach for speech recognition that uses background n-grams and interpolated distanced n-grams, based on an enhanced probabilistic latent semantic analysis (EPLSA) derivation. PLSA is a bag-of-words model that exploits topic information at the document level, which is inconsistent with language modeling in speech recognition. In this paper, we take the word sequence into account in the EPLSA model: the predicted word of an n-gram event is drawn from a topic that is chosen from the topic distribution of the (n-1) history words. However, the EPLSA model cannot capture long-range topic information from outside the n-gram event. To cover this long-range information, distanced n-grams are incorporated in interpolated form (IEPLSA). A cache-based LM, which models re-occurring words, is also incorporated through unigram scaling into the EPLSA and IEPLSA models, which model topical words. On the Wall Street Journal (WSJ) corpus, the proposed approaches yield significant reductions in perplexity and word error rate (WER) over a PLSA-based LM approach.
Keywords :
cache storage; natural language processing; programming language semantics; speech recognition; EPLSA derivation; IEPLSA models; WER; background n-grams; bag-of-words model; cache-based LM; enhanced probabilistic latent semantic analysis; interpolated distanced n-grams; long-distance bigram language model; long-range information; unigram scaling; word error rate; word sequence; Adaptation models; Computational modeling; History; Mathematical model; Semantics; Speech; language model; long-distance n-grams; topic model
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing Conference (EUSIPCO), 2013 Proceedings of the 21st European
Conference_Location :
Marrakech
Type :
conf
Filename :
6811450