Title :
Continuous space language modeling techniques
Author :
Sarikaya, Ruhi ; Emami, Ahmad ; Afify, Mohamed ; Ramabhadran, Bhuvana
Author_Institution :
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
This paper focuses on the comparison of two continuous space language modeling techniques, namely Tied-Mixture Language Modeling (TMLM) and Neural Network based Language Modeling (NNLM). Additionally, we report on using alternative feature representations for the words and histories used in TMLM. Besides bigram co-occurrence based features, we consider using NNLM-based input features for training TMLMs. We also describe how we improve certain steps in building TMLMs. We demonstrate that TMLMs provide significant relative improvements of over 16% and 10% in Character Error Rate (CER) for Mandarin speech recognition over the trigram and NNLM models, respectively, in a speech-to-speech translation task.
Keywords :
natural language processing; neural nets; speech processing; speech recognition; Mandarin speech recognition; bigram co-occurrence; character error rate; continuous space language modeling techniques; feature representations; neural network based language modeling; speech-to-speech translation task; tied mixture language modeling; Degradation; Error analysis; Hidden Markov models; History; Maximum likelihood decoding; Natural language processing; Natural languages; Neural networks; Speech recognition; Training data; Continuous Space Modeling; Language Modeling; NNLM; Tied-Mixture Modeling;
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495009