DocumentCode :
3426616
Title :
Recasting the discriminative n-gram model as a pseudo-conventional n-gram model for LVCSR
Author :
Zhou, Zhengyu ; Meng, Helen
Author_Institution :
The Chinese University of Hong Kong, Hong Kong
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
4933
Lastpage :
4936
Abstract :
Discriminative n-gram language modeling has been used to re-rank candidate recognition hypotheses for performance improvements in large vocabulary continuous speech recognition (LVCSR). Discriminative n-gram modeling is defined in a linear framework. This work demonstrates that the linear discriminative n-gram model can be recast as a pseudo-conventional n-gram model if the order of the discriminative n-gram model is no higher than the order of the n-gram model in the baseline recognizer. Thus the power of the discriminative n-gram model can be captured by mature n-gram-related techniques such as single-pass n-gram decoding or lattice rescoring. This work utilizes the pseudo-conventional n-gram model to rescore the recognition lattices that are generated during decoding. Compared to discriminative N-best re-ranking, this process of discriminative lattice rescoring (DLR) has two advantages: (1) discriminatively top-ranked utterance hypotheses within the lattice search spaces can be efficiently identified by the A* algorithm; (2) the rescored lattices can be conveniently enhanced with other post-processing techniques to achieve cumulative improvement. Experiments with Mandarin LVCSR show that DLR improves efficiency: the computation time for 1000-best re-ranking is reduced more than threefold. The discriminatively rescored lattices are further processed by re-ranking with word-based mutual information (MI). While DLR achieves around 15% relative character error rate (CER) reduction over the recognizer baseline, the MI-based re-ranking brings a further 5% relative CER reduction over the DLR performance.
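The recasting described in the abstract can be illustrated with a minimal sketch. Under a common linear-model formulation (a perceptron-style score that combines the baseline LM log-probability with per-n-gram discriminative weights), weights on n-grams of order no higher than the baseline LM's order can be folded directly into the baseline log-probabilities, yielding an unnormalized "pseudo-conventional" n-gram model usable by any standard decoder or lattice rescorer. All names, values, and the additive combination below are illustrative assumptions, not the paper's exact formulation.

```python
import math

# Baseline bigram log-probabilities: (history, word) -> log P(word | history).
# Tiny hypothetical LM for illustration only.
baseline_logprob = {
    ("<s>", "hello"): math.log(0.6),
    ("hello", "world"): math.log(0.5),
}

# Discriminatively trained per-bigram weights (perceptron-style),
# same order as the baseline LM, so they can be folded in.
disc_weight = {
    ("hello", "world"): 0.3,   # this bigram favors correct hypotheses
    ("<s>", "hello"): -0.1,    # this one slightly penalizes
}

def pseudo_logscore(history, word, lm_scale=1.0):
    """Pseudo-conventional n-gram score: baseline log-probability plus the
    discriminative weight (divided by the LM scale). The result is an
    unnormalized log-score, hence 'pseudo' rather than a true LM."""
    base = baseline_logprob.get((history, word), math.log(1e-6))
    return base + disc_weight.get((history, word), 0.0) / lm_scale

# The folded scores replace the baseline log-probabilities wherever the
# decoder or lattice rescorer would query the LM, so mature single-pass
# decoding or A*-based lattice search applies unchanged.
score = pseudo_logscore("hello", "world")
```

Because the combined model exposes the same (history, word) lookup interface as a conventional n-gram LM, existing lattice-rescoring machinery needs no modification; only the stored scores change.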
Keywords :
maximum likelihood estimation; speech recognition; A* algorithm; LVCSR; discriminative lattice rescoring; discriminative n-gram language modeling; large vocabulary continuous speech recognition; pseudo-conventional n-gram model; re-rank candidate recognition hypotheses; word-based mutual information; Character recognition; Error analysis; Hidden Markov models; Lattices; Maximum likelihood decoding; Maximum likelihood estimation; Mutual information; Natural languages; Speech recognition; Vocabulary; Discriminative N-gram Modeling; LVCSR;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518764
Filename :
4518764