Title :
Doing away with the Viterbi approximation
Author :
Demuynck, Kris ; Van Compernolle, Dirk ; Wambacq, Patrick
Author_Institution :
Katholieke Universiteit Leuven - ESAT, Kasteelpark Arenberg 10, B-3001 Heverlee, Belgium
Abstract :
In this paper, we investigate the use of the total likelihood (the weighted sum of the likelihoods of all possible state sequences) instead of the approximation with the Viterbi likelihood (the likelihood of the best state sequence) normally used in speech recognition. Next to its use in a recognizer, the use of total likelihoods in the context of an automatic word alignment task is also briefly addressed. We describe how the search algorithm must be modified and how word lattices based on total likelihoods can be constructed. The total likelihood framework also requires us to make a distinction between upgrading the language model scores or downgrading the acoustic model scores in the recognizer. To help in deciding between these two alternatives, some theoretical foundation is given to the practice of making a weighted combination of language and acoustic scores. Finally, the total likelihood and the Viterbi framework are compared in terms of accuracy and computational effort on the Wall Street Journal recognition task, while the accuracy of word alignments is evaluated on a large Dutch corpus.
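The difference between the two likelihoods named in the abstract can be illustrated on a toy hidden Markov model. The sketch below is not the authors' recognizer or search algorithm; the two-state model, its probabilities, and the helper names (`log_sum_exp`, `sequence_log_likelihood`) are assumptions made purely for illustration. It runs the same log-domain recursion twice: combining incoming paths with a sum gives the total likelihood, and combining them with a max gives the Viterbi likelihood.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log scores."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_log_likelihood(log_init, log_trans, log_emit, combine):
    """Log-domain HMM recursion.

    log_init[s]     : log prior of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_emit[t][s]  : log likelihood of frame t in state s
    combine         : log_sum_exp -> total likelihood, max -> Viterbi likelihood
    """
    n = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n)]
    for t in range(1, len(log_emit)):
        score = [
            combine([score[p] + log_trans[p][s] for p in range(n)]) + log_emit[t][s]
            for s in range(n)
        ]
    return combine(score)


def to_log(x):
    """Element-wise log of a (possibly nested) list of probabilities."""
    if isinstance(x, list):
        return [to_log(v) for v in x]
    return math.log(x)


# Hypothetical toy model: 2 states, 3 observation frames (illustrative values only).
log_init = to_log([0.6, 0.4])
log_trans = to_log([[0.7, 0.3], [0.4, 0.6]])
log_emit = to_log([[0.5, 0.1], [0.4, 0.3], [0.1, 0.6]])

total = sequence_log_likelihood(log_init, log_trans, log_emit, log_sum_exp)
viterbi = sequence_log_likelihood(log_init, log_trans, log_emit, max)

print(f"total   log-likelihood: {total:.4f}")
print(f"Viterbi log-likelihood: {viterbi:.4f}")
```

Because the best state sequence is one term of the sum over all state sequences, the Viterbi score can never exceed the total score. How large the gap is depends on the model, and this toy example says nothing about its effect on recognition accuracy.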
Keywords :
Acoustics; Approximation algorithms; Lattices; Speech; Speech recognition; Viterbi algorithm; World Wide Web;
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.2002.5743818