• DocumentCode
    1556772
  • Title

    Online unsupervised learning of hidden Markov models for adaptive speech recognition

  • Author

    Chien, J.-T.

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
  • Volume
    148
  • Issue
    5
  • fYear
    2001
  • fDate
    10/1/2001 12:00:00 AM
  • Firstpage
    315
  • Lastpage
    324
  • Abstract
    A novel online unsupervised learning framework is presented that flexibly adapts existing speaker-independent hidden Markov models (HMMs) to nonstationary environments induced by varying speakers, transmission channels, ambient noise, etc. The quasi-Bayes (QB) estimate is applied to incrementally obtain the word sequence and the adaptation parameters for adjusting the HMMs whenever a block of unlabelled data is enrolled. The underlying statistics of a nonstationary environment can thus be tracked successively from the newest enrolment data. To improve the QB estimate, adaptive initial hyperparameters are employed in the beginning session of online learning. These hyperparameters are estimated from a cluster of training speakers closest to the test environment. Additionally, a selection process is developed to select reliable parameters from a list of candidates for unsupervised learning, and a set of reliability assessment criteria is explored for this selection. In a series of speaker adaptation experiments, the effectiveness of the proposed method is confirmed, and it is found that using the adaptive initial hyperparameters in online learning and the multiple assessments in parameter selection can improve the recognition performance.
  • Keywords
    Bayes methods; adaptive signal processing; online operation; parameter estimation; speech recognition; unsupervised learning; adaptive initial hyperparameters; adaptive speech recognition; ambient noise; automatic speech recognition; enrolment data; hidden Markov models; nonstationary environments; online unsupervised learning; parameter selection; quasi-Bayes estimate; recognition performance; reliability assessment criteria; speaker adaptation experiments; speaker-independent HMM; statistics; test environment; transmission channels; unlabelled data; word sequence parameters
  • fLanguage
    English
  • Journal_Title
    Vision, Image and Signal Processing, IEE Proceedings -
  • Publisher
    IET
  • ISSN
    1350-245X
  • Type

    jour

  • DOI
    10.1049/ip-vis:20010560
  • Filename
    974391