DocumentCode :
542288
Title :
Doing away with the Viterbi approximation
Author :
Demuynck, Kris ; Van Compernolle, Dirk ; Wambacq, Patrick
Author_Institution :
Katholieke Universiteit Leuven - ESAT, Kasteelpark Arenberg 10, B-3001 Heverlee, Belgium
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
In this paper, we investigate the use of the total likelihood (the weighted sum of the likelihoods of all possible state sequences) instead of the Viterbi likelihood (the likelihood of the best state sequence), the approximation normally used in speech recognition. Besides its use in a recognizer, the use of total likelihoods in an automatic word alignment task is also briefly addressed. We describe how the search algorithm must be modified and how word lattices based on total likelihoods can be constructed. The total likelihood framework also requires us to distinguish between upgrading the language model scores and downgrading the acoustic model scores in the recognizer. To help decide between these two alternatives, some theoretical foundation is given for the practice of making a weighted combination of language and acoustic scores. Finally, the total likelihood and Viterbi frameworks are compared in terms of accuracy and computational effort on the Wall Street Journal recognition task, while the accuracy of word alignments is evaluated on a large Dutch corpus.
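The distinction the abstract draws between the two likelihoods can be illustrated on a toy hidden Markov model. The sketch below is not taken from the paper: the two-state model, its probabilities, and the names `log_sum_exp` and `hmm_log_likelihood` are assumptions chosen only for illustration. The same recursion is run twice, once combining predecessor scores with a sum (total likelihood) and once with a maximum (Viterbi likelihood).

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-scores."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_likelihood(log_init, log_trans, log_emit, combine):
    """Log-likelihood of an observation sequence under a toy HMM.

    log_init[s]     : log prior of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_emit[t][s]  : log probability of frame t being emitted by state s
    combine         : max gives the Viterbi likelihood (best state sequence),
                      log_sum_exp gives the total likelihood (all sequences)
    """
    n = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n)]
    for frame in log_emit[1:]:
        score = [
            combine([score[p] + log_trans[p][s] for p in range(n)]) + frame[s]
            for s in range(n)
        ]
    return combine(score)


def to_log(matrix):
    """Convert a list (or list of lists) of probabilities to log-probabilities."""
    if isinstance(matrix[0], list):
        return [[math.log(x) for x in row] for row in matrix]
    return [math.log(x) for x in matrix]


# Hypothetical two-state, two-frame model (illustrative values only).
log_init = to_log([0.6, 0.4])
log_trans = to_log([[0.7, 0.3], [0.4, 0.6]])
log_emit = to_log([[0.5, 0.1], [0.2, 0.8]])

total_ll = hmm_log_likelihood(log_init, log_trans, log_emit, log_sum_exp)
viterbi_ll = hmm_log_likelihood(log_init, log_trans, log_emit, max)

print("total log-likelihood:  ", total_ll)
print("Viterbi log-likelihood:", viterbi_ll)
```

Because the sum over all state sequences includes the best one, the total likelihood is never smaller than the Viterbi likelihood. How the recognizer's search and lattice construction must change to use it is the subject of the paper itself and is not captured by this sketch.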
Keywords :
Acoustics; Approximation algorithms; Lattices; Speech; Speech recognition; Viterbi algorithm; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743818
Filename :
5743818