Regional Information Center for Science and Technology - The use of virtual hypothesis copies in decoding of large-vocabulary continuous speech

DocumentCode :

939980

Title :

The use of virtual hypothesis copies in decoding of large-vocabulary continuous speech

Author :

Seide, Frank

Author_Institution :

Microsoft Res. Asia, Beijing, China

Volume :

Issue :

fYear :

2005

fDate :

7/1/2005

Firstpage :

520

Lastpage :

533

Abstract :

High computational effort hinders widespread deployment of large-vocabulary continuous-speech recognition (LVCSR), for example in home or mobile devices. To address this, we developed a novel approach to LVCSR Viterbi decoding with significantly reduced effort. Using a novel search-space organization called virtual hypothesis copies, we eliminate search-space copies that are approximately redundant: 1) Word-lattice generation and (M+1)-gram lattice rescoring are integrated into a single-pass time-synchronous beam search, so that hypothesis copying becomes independent of the language-model order. 2) The word-pair approximation is replaced by the novel phone-history approximation (PHA), under which tree copies are shared among multiple linguistic histories that end in the same phone(s). 3) Copies of individual tree arcs are shared by recombining within-word hypotheses at phone boundaries according to the PHA. With no loss of accuracy, we achieve a search-space reduction of 60-80% for Mandarin LVCSR and of 40-50% for English (NAB 64K). The method is exact under certain model assumptions. A formal specification is derived. In addition, we propose an extremely effective syllable lookahead for Mandarin. Together with the methods above, the search space was reduced 12-15 times and state likelihood evaluations 4-9 times without a significant increase in error.
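
The contrast in item 2 between the word-pair approximation and the phone-history approximation can be illustrated with a toy sketch. This is not the paper's algorithm: the lexicon, the `Hypothesis` record, and the function names are all hypothetical, and the sketch does not model the integrated lattice generation and rescoring of item 1. It only shows that keying tree copies by the final phone(s) of the history, rather than by the predecessor word, yields fewer distinct copies.

```python
from collections import namedtuple

# Hypothetical record: a word history (tuple of words) and an accumulated log score.
Hypothesis = namedtuple("Hypothesis", ["history", "score"])

# Toy pronunciation lexicon (illustrative only, not taken from the paper).
LEXICON = {
    "cat": ["k", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "hat": ["hh", "ae", "t"],
    "dog": ["d", "ao", "g"],
}


def word_pair_key(hyp):
    """Word-pair style key: one tree copy per predecessor word."""
    return hyp.history[-1]


def phone_history_key(hyp, n_phones=1):
    """Phone-history style key: one tree copy per final phone(s) of the history."""
    phones = [p for word in hyp.history for p in LEXICON[word]]
    return tuple(phones[-n_phones:])


def recombine(hyps, key_fn):
    """Viterbi-style recombination: keep the best-scoring hypothesis per key."""
    best = {}
    for hyp in hyps:
        key = key_fn(hyp)
        if key not in best or hyp.score > best[key].score:
            best[key] = hyp
    return best


hyps = [
    Hypothesis(("cat",), -10.0),
    Hypothesis(("bat",), -12.0),
    Hypothesis(("hat",), -11.0),
    Hypothesis(("dog",), -9.5),
]

by_word = recombine(hyps, word_pair_key)       # one copy per distinct last word
by_phone = recombine(hyps, phone_history_key)  # one copy per distinct last phone

print(len(by_word), len(by_phone))
```

In this toy setup the three histories ending in the phone "t" share a single copy, so the number of active copies drops from four to two. A longer phone history (a larger `n_phones`) would trade some of that sharing for finer distinctions between histories.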

Keywords :

Viterbi decoding; speech coding; speech recognition; fast decoding; large-vocabulary continuous speech recognition; novel phone-history approximation; single-pass time-synchronous beam search; virtual hypothesis copies; Decoding; Formal specifications; Hidden Markov models; History; Home computing; Lattices; Mobile computing; Natural languages; Viterbi algorithm; virtual hypothesis

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/TSA.2005.848886

Filename :

1453595

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=939980