Title :
Corrected tandem features for acoustic model training
Author :
Faria, Arlo ; Morgan, Nelson
Author_Institution :
Int. Comput. Sci. Inst., Berkeley, CA
Date :
March 31 2008-April 4 2008
Abstract :
This paper describes a simple method for significantly improving the tandem features used to train acoustic models for large-vocabulary speech recognition. The linear activations at the outputs of an MLP classifier were modified according to known reference labels: where necessary, the activation of the output unit corresponding to the correct phone label was increased so that the frame was classified correctly. This technique was inspired by another experiment that determined a lower bound on the ASR error rate within the tandem framework. By simulating an idealized classifier with forward-backward phone posterior probabilities, we observed a best-case scenario in which nearly all errors were eliminated. Although this performance is not practically achievable, the experiment demonstrated the validity of the tandem processing approach and suggested that considerable gains are possible by improving the MLP phone classifier.
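The correction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say by how much the correct unit's activation is raised, so the `margin` parameter, the function name `correct_activations`, and the array layout (one row per frame, one column per phone class) are all assumptions made for the example.

```python
import numpy as np

def correct_activations(activations, labels, margin=1.0):
    """Raise the correct phone's linear activation where a frame is misclassified.

    activations: (n_frames, n_phones) pre-softmax MLP outputs.
    labels:      (n_frames,) reference phone indices.
    margin:      hypothetical amount by which the corrected unit exceeds
                 the strongest competitor (not specified in the abstract).
    """
    acts = np.array(activations, dtype=float)  # work on a copy
    labels = np.asarray(labels)
    rows = np.arange(acts.shape[0])

    # Strongest activation among the competing (incorrect) units per frame.
    competitors = acts.copy()
    competitors[rows, labels] = -np.inf
    best_other = competitors.max(axis=1)

    # Only frames whose top-scoring unit is not the reference label change.
    wrong = acts.argmax(axis=1) != labels
    acts[rows[wrong], labels[wrong]] = best_other[wrong] + margin
    return acts
```

Frames that the MLP already classifies correctly pass through unchanged, which matches the "where necessary" wording. Since the correction relies on reference labels, a scheme like this would apply to the data used for acoustic model training, where such labels are available.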
Keywords :
acoustic signal processing; multilayer perceptrons; probability; speech recognition; acoustic model training; acoustic models; forward-backward phone posterior probabilities; large-vocabulary speech recognition; tandem features; automatic speech recognition; broadcasting; error analysis; feature extraction; hidden Markov models; Mel frequency cepstral coefficient; performance gain; testing
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518715