DocumentCode :
3151916
Title :
Learning improved linear transforms for speech recognition
Author :
Senior, Andrew ; Cho, Youngmin ; Weston, Jason
Author_Institution :
Google Inc., New York, NY, USA
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
1957
Lastpage :
1960
Abstract :
This paper explores a novel large margin approach to learning a linear transform for dimensionality reduction in speech recognition. The method assumes a trained Gaussian mixture model for each class to be discriminated and trains a dimensionality-reducing linear transform with respect to the fixed model, optimizing a hinge loss on the difference between the distance to the nearest in- and out-of-class Gaussians using stochastic gradient descent. Results are presented showing that the learnt transform improves state classification for individual frames and reduces word error rate compared to Linear Discriminant Analysis (LDA) in a large vocabulary speech recognition problem even after discriminative training.
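The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not in the record: each Gaussian is represented only by its mean in the reduced space, squared Euclidean distance stands in for the Gaussian distance, and the names `project`, `hinge_loss`, `sgd_step`, `margin` and `lr` are hypothetical.

```python
# Illustrative sketch only. Assumptions: Gaussians are reduced to their means
# in the projected space, and distance is squared Euclidean.

def project(A, x):
    """Apply the dimensionality-reducing linear transform A (d x D) to frame x (D)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def sq_dist(y, m):
    """Squared Euclidean distance between projected frame y and mean m."""
    return sum((yi - mi) ** 2 for yi, mi in zip(y, m))

def nearest(y, means):
    """Return the mean closest to y."""
    return min(means, key=lambda m: sq_dist(y, m))

def hinge_loss(A, x, label, gaussians, margin=1.0):
    """Hinge loss on (distance to nearest in-class mean) minus
    (distance to nearest out-of-class mean).
    gaussians maps each class label to a list of fixed mean vectors."""
    y = project(A, x)
    m_in = nearest(y, gaussians[label])
    others = [m for c, ms in gaussians.items() if c != label for m in ms]
    m_out = nearest(y, others)
    loss = max(0.0, margin + sq_dist(y, m_in) - sq_dist(y, m_out))
    return loss, m_in, m_out

def sgd_step(A, x, label, gaussians, margin=1.0, lr=0.01):
    """One stochastic gradient step on A for a single frame; the model stays fixed.
    Returns the loss measured before the update."""
    loss, m_in, m_out = hinge_loss(A, x, label, gaussians, margin)
    if loss > 0.0:
        # With the nearest means held fixed, the gradient of
        # ||Ax - m_in||^2 - ||Ax - m_out||^2 with respect to A
        # is 2 (m_out - m_in) x^T.
        for i, row in enumerate(A):
            g = 2.0 * (m_out[i] - m_in[i])
            for j in range(len(row)):
                row[j] -= lr * g * x[j]
    return loss

if __name__ == "__main__":
    A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]      # 3-D frames reduced to 2-D
    gaussians = {0: [[2.0, 0.0]], 1: [[0.0, 0.0]]}
    x = [1.0, 0.0, 0.5]
    before = sgd_step(A, x, 0, gaussians, lr=0.1)
    after, _, _ = hinge_loss(A, x, 0, gaussians)
    print(before, after)
```

In this toy setup a single update moves the projected frame toward its in-class mean and away from the out-of-class mean, so the loss on that frame drops. A fuller treatment would use the Mahalanobis distance implied by each Gaussian's covariance, as the keywords suggest.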
Keywords :
Gaussian processes; gradient methods; speech recognition; stochastic processes; transforms; vocabulary; LDA; dimensionality-reducing linear transform; discriminative training; hinge loss optimization; individual frame classification; learning improved linear transform; linear discriminant analysis; nearest in-class Gaussian; nearest out-of-class Gaussian; stochastic gradient descent; trained Gaussian mixture model; vocabulary speech recognition problem; word error rate reduction; Error analysis; Hidden Markov models; Speech; Training; margin Mahalanobis distance; speech feature transformation
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6288289
Filename :
6288289