DocumentCode :
939980
Title :
The use of virtual hypothesis copies in decoding of large-vocabulary continuous speech
Author :
Seide, Frank
Author_Institution :
Microsoft Res. Asia, Beijing, China
Volume :
13
Issue :
4
fYear :
2005
fDate :
7/1/2005
Firstpage :
520
Lastpage :
533
Abstract :
High computational effort hinders widespread deployment of large-vocabulary continuous-speech recognition (LVCSR), for example in home or mobile devices. To address this, we developed a novel approach to LVCSR Viterbi decoding with significantly reduced effort. Using a novel search-space organization called virtual hypothesis copies, we eliminate search-space copies that are approximately redundant: 1) Word-lattice generation and (M+1)-gram lattice rescoring are integrated into a single-pass time-synchronous beam search, so that hypothesis copying becomes independent of the language-model order. 2) The word-pair approximation is replaced by the novel phone-history approximation (PHA), under which tree copies are shared among multiple linguistic histories that end in the same phone(s). 3) Copies of individual tree arcs are shared by recombining within-word hypotheses at phone boundaries according to the PHA. With no loss of accuracy, we achieve a search-space reduction of 60-80% for Mandarin LVCSR, and of 40-50% for English (NAB 64 K). The method is exact under certain model assumptions. A formal specification is derived. In addition, we propose an extremely effective syllable lookahead for Mandarin. Together with the methods above, the search space was reduced by a factor of 12-15 and state likelihood evaluations by a factor of 4-9 without a significant increase in errors.
Keywords :
Viterbi decoding; speech coding; speech recognition; fast decoding; large-vocabulary continuous speech recognition; novel phone-history approximation; single-pass time-synchronous beam search; virtual hypothesis copies; Decoding; Formal specifications; Hidden Markov models; History; Home computing; Lattices; Mobile computing; Natural languages; Viterbi algorithm; virtual hypothesis
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/TSA.2005.848886
Filename :
1453595