Title :
Full expansion of context-dependent networks in large vocabulary speech recognition
Author :
Mohri, Mehryar ; Riley, Michael ; Hindle, Don ; Ljolje, Andrej ; Pereira, Fernando
Author_Institution :
AT&T Labs., Florham Park, NJ, USA
Abstract :
We combine our earlier approach to context-dependent network representation with our algorithm for determinizing weighted networks to build optimized networks for large-vocabulary speech recognition that combine an n-gram language model, a pronunciation dictionary and context-dependency modeling. While fully-expanded networks have been used before in restrictive settings (medium vocabulary or no cross-word contexts), we demonstrate that our network determinization method makes it practical to use fully-expanded networks in large-vocabulary recognition with full cross-word context modeling as well. For the DARPA North American Business News task (NAB), we give network sizes and recognition speeds and accuracies using bigram and trigram grammars with vocabulary sizes ranging from 10000 to 160000 words. With our construction, the fully-expanded NAB context-dependent networks contain only about twice as many arcs as the corresponding language models. Interestingly, we also find that, with these networks, real-time word accuracy is improved by increasing the vocabulary size and n-gram order.
Keywords :
context-sensitive grammars; finite automata; natural languages; optimisation; speech recognition; DARPA North American Business News task; algorithm; bigram grammar; context dependent phone models; context-dependency modeling; context-dependent networks; cross-word context modeling; fully-expanded networks; large vocabulary speech recognition; n-gram language model; n-gram order; network determination method; network sizes; optimized networks; pronunciation dictionary; real-time word accuracy; recognition speeds; trigram grammar; vocabulary sizes; weighted automata; weighted finite-state transducers; Acoustic transducers; Business communication; Context modeling; Dictionaries; Intelligent networks; Natural languages; Runtime; Speech recognition; Viterbi algorithm; Vocabulary;
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.675352