• DocumentCode
    336741
  • Title

    HMM training based on quality measurement

  • Author

    Gao, Yuqing ; Jan, Ea-Ee ; Padmanabhan, Mukund ; Picheny, Michael

  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    129
  • Abstract
    Two discriminant measures for HMM states are presented to improve the effectiveness of HMM training. In HMM-based speech recognition, the context-dependent states are usually modeled by Gaussian mixture distributions. In general, the number of Gaussian mixtures for each state is fixed or proportional to the amount of training data. Our study shows that some of the states are “non-aggressive” compared to others and require a higher acoustic resolution. Two methods are presented in this paper to identify those non-aggressive states. The first uses the recognition accuracy of the states, and the second is based on a rank distribution of states. Baseline systems, trained with a fixed number of Gaussian mixtures for each state and having 33 K and 120 K Gaussians, yield word error rates of 14.57% and 13.04%, respectively. Using our approach, a 38 K Gaussian system was constructed that reduces the error rate to 13.95%. The average ranks of non-aggressive states in rank lists of testing data were also seen to improve dramatically compared to the baseline systems.
  • Keywords
    Gaussian processes; error statistics; hidden Markov models; speech recognition; Gaussian mixture distributions; HMM states; HMM training; acoustic resolution; average ranks; baseline systems; context-dependent states; discriminant measures; error rate reduction; nonaggressive states; quality measurement; rank distribution; rank lists; recognition accuracy; speech recognition; testing data; training data; word error rates; Acoustic applications; Acoustic measurements; Acoustic testing; Context modeling; Error analysis; Hidden Markov models; Speech recognition; System performance; System testing; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1999.758079
  • Filename
    758079