  • DocumentCode
    2280674
  • Title
    Dynamic sharings of Gaussian densities using phonetic features
  • Author
    Lee, Kyung-Tak ; Wellekens, Christian J.
  • Author_Institution
    Inst. Eurecom, Sophia Antipolis, France
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    425
  • Lastpage
    428
  • Abstract
    This paper describes a way to adapt the recognizer to pronunciation variability by dynamically sharing Gaussian densities across phonetic models. The method is divided into three steps. First, given an input utterance, an HMM recognizer outputs a lattice of the most likely word hypotheses. Then, the canonical pronunciation of each hypothesis is checked by comparing its theoretical phonetic features to those automatically extracted from speech. If the comparison shows that a phoneme of a hypothesis has likely been pronounced differently, its model is transformed by sharing its Gaussian densities with those of its possible alternate phone realization(s). Finally, the transformed models are used in a second recognition pass. The sharings are dynamic because they are automatically adapted to each input utterance. Experiments showed a 5.4% relative reduction in word error rate compared to the baseline and a 2.7% relative reduction compared to a static method.
  • Keywords
    Gaussian distribution; error statistics; feature extraction; hidden Markov models; speech recognition; Gaussian densities; HMM; phonetic features; pronunciation variability; word error rate; Automatic speech recognition; Error analysis; Humans; Lattices; Robustness
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
  • Print_ISBN
    0-7803-7343-X
  • Type
    conf
  • DOI
    10.1109/ASRU.2001.1034675
  • Filename
    1034675
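
The model transformation described in the abstract (sharing a phoneme model's Gaussian densities with those of its alternate phone realizations) can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how mixture weights are redistributed, so the `alt_mass` parameter, the `Gaussian` class, and the function names are all hypothetical. The sketch pools the components of a canonical state's mixture with those of its alternates and rescales the weights so that they still sum to one.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Gaussian:
    """One diagonal-covariance Gaussian component with its mixture weight."""
    weight: float
    mean: Sequence[float]
    var: Sequence[float]


def log_density(g: Gaussian, x: Sequence[float]) -> float:
    """Log of the (unweighted) diagonal Gaussian density of g at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, g.mean, g.var)
    )


def mixture_likelihood(mixture: List[Gaussian], x: Sequence[float]) -> float:
    """Likelihood of x under a weighted sum of Gaussian components."""
    return sum(g.weight * math.exp(log_density(g, x)) for g in mixture)


def share_densities(
    target: List[Gaussian],
    alternates: List[List[Gaussian]],
    alt_mass: float = 0.5,
) -> List[Gaussian]:
    """Pool the target's components with those of its alternates.

    ASSUMPTION: a fraction `alt_mass` of the total weight is split evenly
    among the alternate mixtures; the paper may use a different scheme.
    """
    if not 0.0 <= alt_mass <= 1.0:
        raise ValueError("alt_mass must lie in [0, 1]")
    if not alternates:
        return list(target)
    # Scale down the canonical components to make room for the alternates.
    shared = [Gaussian(g.weight * (1.0 - alt_mass), g.mean, g.var) for g in target]
    per_alternate = alt_mass / len(alternates)
    for alt in alternates:
        shared.extend(
            Gaussian(g.weight * per_alternate, g.mean, g.var) for g in alt
        )
    return shared


# Toy 1-D example: a canonical phoneme model and one alternate realization.
canonical = [Gaussian(1.0, (0.0,), (1.0,))]
alternate = [Gaussian(1.0, (3.0,), (1.0,))]
shared_model = share_densities(canonical, [alternate])

frame = (3.0,)  # an observation that resembles the alternate realization
print(mixture_likelihood(canonical, frame))
print(mixture_likelihood(shared_model, frame))
```

In this toy setting the shared model assigns a higher likelihood to a frame that matches the alternate realization than the canonical model does, which is the effect a second recognition pass would exploit. Which phonemes get shared, and with which alternates, would be decided per utterance by the phonetic-feature comparison, which is not modelled here.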