Title :
Efficient WFST-Based One-Pass Decoding With On-The-Fly Hypothesis Rescoring in Extremely Large Vocabulary Continuous Speech Recognition
Author :
Hori, Takaaki ; Hori, Chiori ; Minami, Yasuhiro ; Nakamura, Atsushi
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto
Date :
1 May 2007
Abstract :
This paper proposes a novel one-pass search algorithm with on-the-fly composition of weighted finite-state transducers (WFSTs) for large-vocabulary continuous-speech recognition. In the standard search method with on-the-fly composition, two or more WFSTs are composed during decoding, and a Viterbi search is performed over the composed search space. With the new method, the Viterbi search is performed over only the first of the two WFSTs. The second WFST is used solely to rescore the hypotheses generated during the search. Since this rescoring is very efficient, the total amount of computation required by the new method is almost the same as when using only the first WFST. On a 65k-word-vocabulary spontaneous lecture speech transcription task, the proposed method significantly outperformed the standard search method. Furthermore, it was faster than decoding with a single fully composed and optimized WFST, while using only 38% of the memory required by the single-WFST approach. Finally, we achieved high-accuracy one-pass real-time speech recognition with an extremely large vocabulary of 1.8 million words.
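Illustration :
The following is a minimal, hypothetical sketch of the idea described in the abstract, not the authors' decoder or any real FST toolkit's API. It assumes WFSTs stored as plain Python dicts mapping a state to arcs (input label, output label, weight, next state), weights given as negative log probabilities, and an arbitrary beam width of 8. The Viterbi-style beam search expands arcs of the first WFST only; whenever an arc emits a word, the second WFST (e.g., one encoding a higher-order language model correction) is consulted to adjust the hypothesis score on the fly.

    # Toy sketch of on-the-fly hypothesis rescoring (all names hypothetical).
    # arcs[state] -> list of (input_label, output_label, weight, next_state);
    # output_label is None for arcs that emit no word.

    def decode_with_rescoring(wfst1, wfst2, inputs, start1, start2, finals1):
        """Beam search over wfst1 only; wfst2 rescoreses each hypothesis
        lazily whenever wfst1 emits a word label."""
        # hypothesis: (score, state in wfst1, state in wfst2, word sequence)
        beam = [(0.0, start1, start2, ())]
        for symbol in inputs:
            candidates = []
            for score, s1, s2, words in beam:
                for in_lab, out_lab, w, t1 in wfst1.get(s1, []):
                    if in_lab != symbol:
                        continue
                    new_score, new_s2, new_words = score + w, s2, words
                    if out_lab is not None:
                        # On-the-fly rescoring: follow the matching arc in wfst2
                        # and add its weight to the hypothesis score.
                        for lab2, _, w2, t2 in wfst2.get(s2, []):
                            if lab2 == out_lab:
                                new_score += w2
                                new_s2 = t2
                                break
                        new_words = words + (out_lab,)
                    candidates.append((new_score, t1, new_s2, new_words))
            # Keep the 8 lowest-cost hypotheses (assumed beam width).
            beam = sorted(candidates, key=lambda h: h[0])[:8]
        finished = [h for h in beam if h[1] in finals1]
        return min(finished, key=lambda h: h[0], default=None)

Because the search space itself is never composed with the second WFST, the bookkeeping per hypothesis is limited to one extra state and one weight update per emitted word, which is what keeps the rescoring cost low in this sketch.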
Keywords :
Viterbi decoding; search problems; speech coding; speech recognition; transducers; Viterbi search; WFST-based one-pass decoding; large vocabulary continuous speech recognition; on-the-fly hypothesis rescoring; one-pass search algorithm; speech transcription; weighted finite-state transducers; Decoding; Hidden Markov models; Laboratories; Optimization methods; Search methods; Speech processing; Speech recognition; Transducers; Viterbi algorithm; Vocabulary; On-the-fly composition; speech recognition; weighted finite-state transducer (WFST);
Journal_Title :
IEEE Transactions on Audio, Speech, and Language Processing
DOI :
10.1109/TASL.2006.889790