Authors:
Xu, P.; Khudanpur, S.; Lehr, M.; Prud'hommeaux, E.; Glenn, N.; Karakos, D.; Roark, B.; Sagae, K.; Saraçlar, M.; Shafran, I.; Bikel, D.; Callison-Burch, C.; Cao, Y.; Hall, K.; Hasler, E.; Koehn, P.; Lopez, A.; Post, M.; Riley, D.
Abstract:
Discriminative language modeling is a structured classification problem, and log-linear models have previously been used to address it. In this paper, the standard dot-product feature representation used in log-linear models is replaced by a non-linear function parameterized by a neural network. Embeddings are learned for each word, and features are extracted automatically through convolutional layers. Experimental results show that, as a stand-alone model, the continuous space model yields a significantly lower word error rate (1% absolute reduction) while having a much more compact parameterization (60%-90% fewer parameters). When combined with the baseline model's scores, our approach performs on par with the baseline.
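The architecture described above (learned word embeddings, convolutional feature extraction, and a non-linear score replacing the log-linear dot product) can be illustrated with a minimal NumPy sketch. All dimensions, parameter names, and the tanh/max-pooling choices here are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (assumptions, not the paper's settings):
V, d, k, f = 10, 8, 3, 4  # vocab size, embedding dim, conv window, num filters

E = rng.normal(size=(V, d))      # word embedding table (learned during training)
W = rng.normal(size=(f, k * d))  # convolutional filters over k-word windows
v = rng.normal(size=f)           # output weights mapping pooled features to a score

def score(hypothesis):
    """Score a word-ID sequence: embed, convolve, max-pool, project to a scalar.

    This replaces the dot product between a weight vector and hand-crafted
    n-gram features with a non-linear function of learned embeddings.
    """
    X = E[hypothesis]                                # (T, d) embedded sequence
    windows = np.stack([X[i:i + k].ravel()           # sliding k-word windows
                        for i in range(len(hypothesis) - k + 1)])
    H = np.tanh(windows @ W.T)                       # (T-k+1, f) feature maps
    pooled = H.max(axis=0)                           # max-pool over positions
    return float(v @ pooled)                         # scalar hypothesis score

# Discriminative reranking then prefers the highest-scoring hypothesis.
hyps = [[1, 2, 3, 4, 5], [1, 2, 6, 4, 5]]
best = max(hyps, key=score)
```

In an actual system, E, W, and v would be trained discriminatively on recognizer hypotheses rather than drawn at random as in this sketch.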
Keywords:
computational linguistics; neural networks; pattern classification; discriminative language modeling; continuous space model; convolutional layers; log-linear models; nonlinear function; dot-product feature representation; structured classification; word error rate; feature extraction; speech; training; vectors