• DocumentCode
    2239303
  • Title

    Discriminative optimisation of large vocabulary recognition systems

  • Author

    Valtchev, V. ; Woodland, P.C. ; Young, S.J.

  • Author_Institution
    Dept. of Eng., Cambridge Univ., UK
  • Volume
    1
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    18
  • Abstract
    Describes a framework for optimising the structure and parameters of a continuous-density HMM-based large-vocabulary speech recognition system using the maximum mutual information estimation (MMIE) criterion. To reduce the computational complexity of the MMIE training algorithm, confusable segments of speech are identified and stored as word lattices of alternative utterance hypotheses. An iterative mixture splitting procedure is also employed to adjust the number of mixture components in each state during training such that the optimal balance between the number of parameters and the available training data is achieved. Experiments are presented on various test sets from the Wall Street Journal database using the full SI-284 training set. These show that the use of lattices makes MMIE training practicable for very complex recognition systems and large training sets. Furthermore, experimental results demonstrate that MMIE optimisation of system structure and parameters can yield useful increases in recognition accuracy.
  • Keywords
    computational complexity; hidden Markov models; optimisation; speech recognition; vocabulary; MMIE training algorithm; SI-284 training set; Wall Street Journal database; alternative utterance hypotheses; computational complexity; confusable speech segments; continuous-density HMM-based large-vocabulary speech recognition system; discriminative optimization; hidden Markov model; iterative mixture splitting procedure; maximum mutual information estimation criterion; mixture components; parameter optimization; recognition accuracy; system structure optimization; word lattices; Computational complexity; Databases; Hidden Markov models; Iterative algorithms; Lattices; Maximum likelihood estimation; Mutual information; Speech recognition; Training data; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.606992
  • Filename
    606992