DocumentCode
323757
Title
Full expansion of context-dependent networks in large vocabulary speech recognition
Author
Mohri, Mehryar ; Riley, Michael ; Hindle, Don ; Ljolje, Andrej ; Pereira, Fernando
Author_Institution
AT&T Labs., Florham Park, NJ, USA
Volume
2
fYear
1998
fDate
12-15 May 1998
Firstpage
665
Abstract
We combine our earlier approach to context-dependent network representation with our algorithm for determinizing weighted networks to build optimized networks for large-vocabulary speech recognition combining an n-gram language model, a pronunciation dictionary and context-dependency modeling. While fully-expanded networks have been used before in restrictive settings (medium vocabulary or no cross-word contexts), we demonstrate that our network determinization method makes it practical to use fully-expanded networks also in large-vocabulary recognition with full cross-word context modeling. For the DARPA North American Business News task (NAB), we give network sizes and recognition speeds and accuracies using bigram and trigram grammars with vocabulary sizes ranging from 10000 to 160000 words. With our construction, the fully-expanded NAB context-dependent networks contain only about twice as many arcs as the corresponding language models. Interestingly, we also find that, with these networks, real-time word accuracy is improved by increasing the vocabulary size and n-gram order.
Keywords
context-sensitive grammars; finite automata; natural languages; optimisation; speech recognition; DARPA North American Business News task; algorithm; bigram grammar; context dependent phone models; context-dependency modeling; context-dependent networks; cross-word context modeling; fully-expanded networks; large vocabulary speech recognition; n-gram language model; n-gram order; network determination method; network sizes; optimized networks; pronunciation dictionary; real-time word accuracy; recognition speeds; trigram grammar; vocabulary sizes; weighted automata; weighted finite-state transducers; Acoustic transducers; Business communication; Context modeling; Dictionaries; Intelligent networks; Natural languages; Runtime; Speech recognition; Viterbi algorithm; Vocabulary;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location
Seattle, WA
ISSN
1520-6149
Print_ISBN
0-7803-4428-6
Type
conf
DOI
10.1109/ICASSP.1998.675352
Filename
675352