• DocumentCode
    3523211
  • Title
    Iterative normalization for speaker-adaptive training in continuous speech recognition
  • Author
    Feng, Ming-Whei ; Schwartz, Richard ; Kubala, Francis ; Makhoul, John

  • Author_Institution
    Northeastern Univ., Boston, MA, USA
  • fYear
    1989
  • fDate
    23-26 May 1989
  • Firstpage
    612
  • Abstract
    The authors present several techniques to improve an algorithm presented last year for speaker-adaptive training in continuous speech recognition. The previous method uses a transformation matrix to modify the hidden Markov model (HMM) parameters of a prechosen prototype speaker to model a target speaker. To estimate the transformation matrix, it aligns a set of target speech with the same set of speech uttered by the prototype speaker using dynamic time warping. The authors focus on improving the previous method in the modeling of the spectral differences between two speakers, and the accuracy of the alignment. To improve the modeling of the spectral differences, they implemented a phoneme-dependent mapping procedure which transforms the prototype HMMs to the estimated target HMMs using a set of phoneme-dependent matrices. To improve the alignment, the authors developed a modeling of the silence, a linear duration normalization, and an iterative normalization procedure. They tested the new methods on the standard DARPA database with a grammar of perplexity 60. The performance shows a 30% word-error reduction compared to the previous algorithm.
  • Keywords
    iterative methods; speech recognition; DARPA database; continuous speech recognition; dynamic time warping; grammar; hidden Markov model; iterative normalization; linear duration normalization; modeling; phoneme-dependent mapping; phoneme-dependent matrices; silence; speaker-adaptive training; transformation matrix; word-error reduction; Databases; Degradation; Hidden Markov models; Iterative algorithms; Laboratories; Maximum likelihood estimation; Prototypes; Speech recognition; Testing; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on
  • Conference_Location
    Glasgow
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.1989.266501
  • Filename
    266501
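
The abstract mentions aligning target-speaker speech with the same speech from a prototype speaker using dynamic time warping. The following is a minimal, generic sketch of textbook dynamic time warping, not the authors' implementation: the function name `dtw_align`, the squared Euclidean frame distance, and the three-way step pattern are all assumptions made for illustration. The paper's silence modeling, linear duration normalization, and iterative normalization are not shown.

```python
from math import inf


def dtw_align(target, prototype):
    """Align two sequences of feature vectors with basic DTW.

    Returns (total_cost, path), where path is a list of
    (target_index, prototype_index) pairs. Hypothetical helper for
    illustration only; assumes both sequences are non-empty.
    """
    n, m = len(target), len(prototype)

    def dist(a, b):
        # Assumed local cost: squared Euclidean distance between frames.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # cost[i][j] = best cumulative cost of aligning target[:i] with prototype[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(target[i - 1], prototype[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance target only
                cost[i][j - 1],      # advance prototype only
            )

    # Backtrack from the end to recover the frame-to-frame alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy one-dimensional "frames": the target repeats the middle frame.
target_frames = [[0.0], [1.0], [1.0], [2.0]]
prototype_frames = [[0.0], [1.0], [2.0]]
total_cost, alignment = dtw_align(target_frames, prototype_frames)
```

In this toy case both target frames with value 1.0 map to the single prototype frame with value 1.0, which is the kind of frame pairing an alignment-based estimate of a speaker transformation would consume.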