A comparison of dynamic WFST decoding approaches

Author

Dixon, Paul R. ; Hori, Chiori ; Kashioka, Hideki

Author_Institution

Nat. Inst. of Inf. & Commun. Technol., Kyoto, Japan

fYear

2012

fDate

25-30 March 2012

Firstpage

4209

Lastpage

4212

Abstract

This paper compares lookahead composition and on-the-fly hypothesis rescoring using a common decoder. Results on a large vocabulary speech recognition task illustrate how the two algorithms differ in error rate, real-time factor, memory usage and the decoder's internal statistics. Evaluations were performed with the decoder operating at either the state level or the arc level. The results show that the dynamic approaches also work well at the state level, even though the dynamic construction cost is greater.

Keywords

error statistics; speech coding; speech recognition; arc level; decoder; dynamic WFST decoding; dynamic construction cost; error rate; internal statistics; large vocabulary speech recognition task; lookahead composition; memory usage; on-the-fly hypothesis rescoring; real time factor; state level; weighted finite state transducer; Acoustic beams; Acoustics; Decoding; Heuristic algorithms; Speech recognition; Transducers; Vocabulary; WFST; on-the-fly composition

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location

Kyoto

ISSN

1520-6149

Print_ISBN

978-1-4673-0045-2

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2012.6288847

Filename

6288847