Paraphrastic neural network language models

Author

Liu, Xunying ; Gales, Mark J.F. ; Woodland, Philip C.

Author_Institution

Eng. Dept., Cambridge Univ., Cambridge, UK

fYear

2014

fDate

4-9 May 2014

Firstpage

4903

Lastpage

4907

Abstract

Expressive richness in natural languages presents a significant challenge for statistical language models (LMs). As multiple word sequences can represent the same underlying meaning, modelling only the observed surface word sequence can lead to poor context coverage. To handle this issue, paraphrastic LMs were previously proposed to improve the generalization of back-off n-gram LMs. Paraphrastic neural network LMs (NNLMs) are investigated in this paper. Using a paraphrastic multi-level feedforward NNLM modelling both word and phrase sequences, significant error rate reductions of 1.3% absolute (8% relative) and 0.9% absolute (5.5% relative) were obtained over the baseline n-gram and NNLM systems, respectively, on a state-of-the-art conversational telephone speech recognition system trained on 2000 hours of audio and 545 million words of text.

Keywords

natural language processing; neural nets; speech recognition; back-off n-gram LMs; error rate reductions; multiple word sequences; natural languages; observed surface word sequence; paraphrastic LMs; paraphrastic multi-level feedforward NNLM modelling; paraphrastic neural network language models; statistical language models; telephone speech recognition system; time 2000 hour; Artificial neural networks; Computational modeling; Context; Feedforward neural networks; Lattices; Mathematical model; neural network language model; paraphrase

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6854534

Filename

6854534