• DocumentCode
    1445832
  • Title

    A structural Bayes approach to speaker adaptation

  • Author

    Shinoda, Koichi ; Lee, Chin-Hui

  • Author_Institution
    Dept. of Comput. & Commun. Media Res., NEC Corp., Kawasaki, Japan
  • Volume
    9
  • Issue
    3
  • fYear
    2001
  • fDate
    3/1/2001 12:00:00 AM
  • Firstpage
    276
  • Lastpage
    287
  • Abstract
    Maximum a posteriori (MAP) estimation has been successfully applied to speaker adaptation in speech recognition systems using hidden Markov models. When the amount of data is sufficiently large, MAP estimation yields recognition performance as good as that obtained using maximum-likelihood (ML) estimation. This paper describes a structural maximum a posteriori (SMAP) approach to improve the MAP estimates obtained when the amount of adaptation data is small. A hierarchical structure in the model parameter space is assumed and the probability density functions for model parameters at one level are used as priors for those of the parameters at adjacent levels. Results of supervised adaptation experiments using nonnative speakers' utterances showed that SMAP estimation reduced error rates by 61% when ten utterances were used for adaptation and that it yielded the same accuracy as MAP and ML estimation when the amount of data was sufficiently large. Furthermore, the recognition results obtained in unsupervised adaptation experiments showed that SMAP estimation was effective even when only one utterance from a new speaker was used for adaptation. An effective way to combine rapid supervised adaptation and on-line unsupervised adaptation was also investigated.
  • Keywords
    Bayes methods; hidden Markov models; maximum likelihood estimation; speech recognition; MAP estimation; SMAP approach; error rates; hidden Markov models; hierarchical structure; maximum a posteriori estimation; model parameter space; nonnative speaker; probability density functions; speaker adaptation; speech recognition systems; structural Bayes approach; structural maximum a posteriori approach; supervised adaptation experiments; unsupervised adaptation experiments; utterances; Adaptation model; Degradation; Estimation error; Hidden Markov models; Loudspeakers; Maximum likelihood estimation; Parameter estimation; Probability density function; Speech recognition; Yield estimation;
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type

    jour

  • DOI
    10.1109/89.906001
  • Filename
    906001
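
  • Illustration
    The abstract's idea of using the estimate at one level of a hierarchy as the prior for the next level can be sketched with the textbook MAP estimate of a Gaussian mean under a conjugate prior. This is a simplified illustration only, not the paper's SMAP algorithm: the function names, the two-level layout, and the prior weight `tau` are assumptions introduced here.

```python
from typing import Sequence, Tuple


def map_mean(prior_mean: float, tau: float, samples: Sequence[float]) -> float:
    """MAP estimate of a Gaussian mean under a conjugate Gaussian prior.

    tau is a hypothetical prior weight (a pseudo-count): with few samples the
    estimate stays near prior_mean, and with many samples it approaches the
    ML estimate (the sample mean).
    """
    n = len(samples)
    return (tau * prior_mean + sum(samples)) / (tau + n)


def two_level_map(
    global_mean: float,
    tau: float,
    parent_samples: Sequence[float],
    child_samples: Sequence[float],
) -> Tuple[float, float]:
    """Propagate a prior down a two-level hierarchy (illustrative assumption).

    The parent node is adapted with all data pooled under it, using the
    global mean as its prior. The child node then uses the parent's
    adapted estimate as its own prior.
    """
    parent = map_mean(global_mean, tau, parent_samples)
    child = map_mean(parent, tau, child_samples)
    return parent, child


if __name__ == "__main__":
    # A child with no data of its own inherits the parent's adapted estimate.
    print(two_level_map(0.0, 2.0, [2.0, 2.0], []))
```

    In this sketch a child node with no adaptation data falls back to its parent's adapted estimate rather than to the unadapted global mean, which is the intuition behind using a hierarchy when adaptation data is scarce.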