• DocumentCode
    542288
  • Title

    Doing away with the Viterbi approximation

  • Author

    Demuynck, Kris ; Van Compernolle, Dirk ; Wambacq, Patrick

  • Author_Institution
    Katholieke Universiteit Leuven - ESAT, Kasteelpark Arenberg 10, B-3001 Heverlee, Belgium
  • Volume
    1
  • fYear
    2002
  • fDate
    13-17 May 2002
  • Abstract
    In this paper, we investigate the use of the total likelihood (the weighted sum of the likelihoods of all possible state sequences) instead of the Viterbi likelihood (the likelihood of the best state sequence), the approximation normally used in speech recognition. Besides its use in a recognizer, the use of total likelihoods in an automatic word alignment task is also briefly addressed. We describe how the search algorithm must be modified and how word lattices based on total likelihoods can be constructed. The total likelihood framework also requires us to distinguish between upgrading the language model scores and downgrading the acoustic model scores in the recognizer. To help decide between these two alternatives, we give some theoretical foundation for the practice of making a weighted combination of language and acoustic scores. Finally, the total likelihood and Viterbi frameworks are compared in terms of accuracy and computational effort on the Wall Street Journal recognition task, while the accuracy of word alignments is evaluated on a large Dutch corpus.
  • Keywords
    Acoustics; Approximation algorithms; Lattices; Speech; Speech recognition; Viterbi algorithm; World Wide Web;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
  • Conference_Location
    Orlando, FL, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2002.5743818
  • Filename
    5743818