Title :
Corrective language modeling for large vocabulary ASR with the perceptron algorithm
Author :
Roark, Brian ; Saraclar, Murat ; Collins, Michael
Author_Institution :
AT&T Labs - Research, USA
Abstract :
This paper investigates error-corrective language modeling using the perceptron algorithm on word lattices. The resulting model is encoded as a weighted finite-state automaton and is applied by intersecting it with word lattices, making it simple and inexpensive to use during decoding. We present results for various training scenarios on the Switchboard task, including n-gram features of different orders and n-best extraction versus full word lattices. We demonstrate the importance of making the training conditions as close as possible to the testing conditions. The best approach yields a 1.3 percent improvement in first-pass accuracy, which translates to a 0.5 percent improvement after other rescoring passes.
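The abstract's approach can be illustrated with a minimal sketch of perceptron training for corrective reranking. This is not the authors' implementation (which operates on full word lattices encoded as weighted finite-state automata); it is a simplified n-best variant under assumed data structures: each hypothesis is a tuple of (word list, baseline recognizer score, word-error count), and the corrective model's weights are a dictionary over n-gram features.

```python
from collections import Counter

def ngram_features(words, n=2):
    """Count n-gram features of orders 1..n for a hypothesis."""
    feats = Counter()
    for order in range(1, n + 1):
        for i in range(len(words) - order + 1):
            feats[tuple(words[i:i + order])] += 1
    return feats

def score(weights, feats, base_score):
    """Baseline recognizer score plus the corrective model's contribution."""
    return base_score + sum(weights.get(f, 0.0) * c for f, c in feats.items())

def perceptron_train(nbest_lists, epochs=5, n=2):
    """Structured perceptron: for each utterance, find the current
    model-best hypothesis and the lowest-error (oracle) hypothesis,
    then update feature weights toward the oracle and away from the
    model-best when they differ."""
    weights = {}
    for _ in range(epochs):
        for hyps in nbest_lists:
            # hyps: list of (words, base_score, word_error_count)
            best = max(hyps, key=lambda h: score(weights, ngram_features(h[0], n), h[1]))
            oracle = min(hyps, key=lambda h: h[2])
            if best is not oracle:
                for f, c in ngram_features(oracle[0], n).items():
                    weights[f] = weights.get(f, 0.0) + c
                for f, c in ngram_features(best[0], n).items():
                    weights[f] = weights.get(f, 0.0) - c
    return weights
```

Because the learned model is just a weighted set of n-gram features, it can be compiled into a weighted finite-state automaton and intersected with word lattices at decoding time, which is what makes the approach inexpensive to apply in a first pass.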
Keywords :
error correction; feature extraction; finite automata; learning (artificial intelligence); perceptrons; speech recognition; vocabulary; Switchboard task; automatic speech recognition; error-corrective language modeling; large vocabulary ASR; n-best extraction; n-gram features; perceptron algorithm; training; weighted finite-state automaton; word lattices; Artificial intelligence; Automata; Automatic speech recognition; Costs; Decoding; Hidden Markov models; Laboratories; Lattices; Testing; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1326094