DocumentCode
3443294
Title
Generalized optimization algorithm for speech recognition transducers
Author
Allauzen, Cyril ; Mohri, Mehryar
Author_Institution
AT&T Labs.-Res., USA
Volume
1
fYear
2003
fDate
6-10 April 2003
Abstract
Weighted transducers provide a common representation for the components of a speech recognition system. In previous work, we showed that these components can be combined off-line into a single compact recognition transducer that maps HMM state sequences directly to word sequences. The construction of that recognition transducer and its efficiency of use depend critically on a general optimization algorithm, determinization. However, not all weighted automata and transducers used in large-vocabulary speech recognition are determinizable. We present a general algorithm that can make an arbitrary weighted transducer determinizable, and we generalize our previous optimization technique for building an integrated recognition transducer to deal with arbitrary weighted transducers used in speech recognition. We report experimental results on a large-vocabulary speech recognition task, How May I Help You (HMIHY), showing that our generalized technique leads to a recognition transducer that performs as well as our original solution in the case of classical n-gram models while inserting fewer special symbols, and that it leads to a substantial improvement in recognition speed, by a factor of 2.6, on the same task when using a class-based language model.
Keywords
grammars; hidden Markov models; speech recognition; transducers; HMM word sequences; class-based language model; compact recognition transducer; determinization; general optimization algorithm; generalized optimization algorithm; integrated recognition transducer; large-vocabulary speech recognition; n-gram models; original solution; recognition speed; speech recognition system; speech recognition transducers; state sequences; weighted automata; weighted transducers; Automata; Automatic speech recognition; Context modeling; Dictionaries; Helium; Hidden Markov models; Natural languages; Speech recognition; Transducers; Vocabulary;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-7663-3
Type
conf
DOI
10.1109/ICASSP.2003.1198790
Filename
1198790