Title :
Improvement of non-negative matrix factorization based language model using exponential models
Author :
Novak, Miroslav ; Mammone, Richard
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
This paper describes the use of exponential models to improve non-negative matrix factorization (NMF) based topic language models for automatic speech recognition. The NMF modeling technique borrows its basic idea from latent semantic analysis (LSA), which is typically used in information retrieval. Using exponential models to estimate the a posteriori topic probabilities for an observed history improved the perplexity of the NMF model, yielding an overall 24% perplexity improvement over a trigram language model.
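A minimal sketch of the idea summarized in the abstract, under stated assumptions: topic posteriors for a history are computed with an exponential (log-linear, softmax) model and then used to mix per-topic word distributions. The abstract does not give the features, training procedure, or data, so the function names (topic_posteriors, word_prob, perplexity), the bag-of-words feature choice, and all toy numbers are hypothetical and are not the authors' formulation.

```python
import math

def topic_posteriors(history, weights, bias):
    """Assumed exponential model: P(t | h) proportional to
    exp(bias[t] + sum of weights[t][w] over the words w in the history)."""
    scores = [
        bias[t] + sum(weights[t].get(w, 0.0) for w in history)
        for t in range(len(bias))
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def word_prob(word, history, topic_word, weights, bias):
    """Topic mixture: P(w | h) = sum over t of P(w | t) * P(t | h)."""
    post = topic_posteriors(history, weights, bias)
    return sum(p * topic_word[t].get(word, 0.0) for t, p in enumerate(post))

def perplexity(log_probs):
    """Perplexity from natural-log word probabilities: exp(-mean log P)."""
    return math.exp(-sum(log_probs) / len(log_probs))

# Toy, made-up parameters: two topics over a four-word vocabulary.
VOCAB = ["stock", "market", "game", "team"]
TOPIC_WORD = [
    {"stock": 0.4, "market": 0.4, "game": 0.1, "team": 0.1},
    {"stock": 0.1, "market": 0.1, "game": 0.4, "team": 0.4},
]
WEIGHTS = [
    {"stock": 1.0, "market": 1.0},
    {"game": 1.0, "team": 1.0},
]
BIAS = [0.0, 0.0]

if __name__ == "__main__":
    hist = ["stock", "market"]
    print(topic_posteriors(hist, WEIGHTS, BIAS))
    print(word_prob("stock", hist, TOPIC_WORD, WEIGHTS, BIAS))
```

In this toy setup a history of finance words shifts the posterior toward the first topic, which raises the mixture probability of related words. In a full system such a topic model would typically be combined with an n-gram model, a step this sketch leaves out.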
Keywords :
linguistics; matrix decomposition; parameter estimation; probability; speech recognition; text analysis; a posteriori topic probabilities; automatic speech recognition; exponential models; information retrieval; latent semantic analysis; nonnegative matrix factorization; perplexity; topic language model; History; Information analysis; Iterative algorithms; Natural languages; Singular value decomposition; Training data; Vectors
Conference_Title :
Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
Print_ISBN :
0-7803-7343-X
DOI :
10.1109/ASRU.2001.1034619