DocumentCode :
2330170
Title :
Feature-rich continuous language models for speech recognition
Author :
Mirowski, Piotr ; Chopra, Sumit ; Balakrishnan, Suhrid ; Bangalore, Srinivas
Author_Institution :
Courant Inst. of Math. Sci., New York Univ., New York, NY, USA
fYear :
2010
fDate :
12-15 Dec. 2010
Firstpage :
241
Lastpage :
246
Abstract :
State-of-the-art probabilistic models of text such as n-grams require an exponential number of examples as the size of the context grows, a problem stemming from the discrete word representation. We propose to solve this problem by learning a continuous-valued and low-dimensional mapping of words, and base our predictions of the target-word probabilities on non-linear dynamics of the latent-space representation of the words in the context window. We build on neural network-based language models; by expressing them as energy-based models, we can further enrich them with additional inputs such as part-of-speech tags, topic information, and graphs of word similarity. We demonstrate significantly lower perplexity on several text corpora, as well as improved word accuracy on speech recognition tasks, compared to Kneser-Ney back-off n-gram language models.
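The core idea in the abstract, mapping words to low-dimensional continuous vectors and predicting the target word from the latent representation of its context, can be illustrated with a minimal sketch. The toy vocabulary, random parameters, and the simple linear-softmax predictor below are illustrative assumptions, not the authors' energy-based architecture:

```python
import numpy as np

# A minimal continuous-space language model sketch (assumed toy setup):
# each word gets a low-dimensional embedding, the context embeddings are
# concatenated, and a softmax over a linear score gives P(word | context).
rng = np.random.default_rng(0)

vocab = ["the", "cat", "sat", "on", "mat"]
V = len(vocab)   # vocabulary size
d = 4            # embedding dimension (the low-dimensional mapping)
n = 2            # context window size

E = rng.normal(size=(V, d))      # word embedding matrix
W = rng.normal(size=(n * d, V))  # maps concatenated context to word scores
b = np.zeros(V)

def next_word_probs(context_ids):
    """P(w | context): embed context words, concatenate, score, softmax."""
    z = np.concatenate([E[i] for i in context_ids])  # latent representation
    scores = z @ W + b
    scores -= scores.max()                           # numerical stability
    p = np.exp(scores)
    return p / p.sum()

p = next_word_probs([vocab.index("the"), vocab.index("cat")])
print(p.shape)  # (5,) - a proper distribution over the vocabulary
```

In a trained model, E, W, and b would be learned jointly by gradient descent on the log-likelihood; because similar words receive nearby embeddings, probability mass generalizes across contexts that n-gram counts treat as unrelated.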
Keywords :
natural language processing; neural nets; probability; speech recognition; discrete word representation; energy-based models; feature-rich continuous language models; neural networks; state-of-the-art probabilistic models; topic information; natural language
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-7904-7
Electronic_ISBN :
978-1-4244-7902-3
Type :
conf
DOI :
10.1109/SLT.2010.5700858
Filename :
5700858