• DocumentCode
    3423704
  • Title
    Adaptation of compressed HMM parameters for resource-constrained speech recognition
  • Author
    Li, Jinyu ; Deng, Li ; Yu, Dong ; Wu, Jian ; Gong, Yifan ; Acero, Alex
  • Author_Institution
    Microsoft Corp., Redmond, WA
  • fYear
    2008
  • fDate
    March 31 2008-April 4 2008
  • Firstpage
    4333
  • Lastpage
    4336
  • Abstract
    Recently, we developed and reported a new unsupervised online adaptation technique, which jointly compensates for additive and convolutive distortions with vector Taylor series (JAC/VTS), to adjust (uncompressed) HMMs in acoustically distorted environments. In this paper, we extend that technique to adapt compressed HMMs using JAC/VTS where only limited computation and/or memory resources are available for speech recognition (e.g., on mobile devices). Subspace coding (SSC) is developed and used to quantize each dimension of the multivariate Gaussians in the compressed HMMs. Three algorithmic design options that combine SSC with JAC/VTS are proposed and evaluated, each making a different tradeoff between recognition accuracy and the required computation/memory/storage resources. The strengths and weaknesses of these three options are discussed and demonstrated on the Aurora2 task of noise-robust speech recognition. The first option greatly reduces the storage space and gives 93.2% accuracy, the same as the baseline, but with little reduction in the run-time computation/memory cost. The second option reduces the computation cost by about 79.9% and the memory requirement by about 33.5% at the small price of a 0.5% absolute decrease in accuracy (to 92.7%). The third option cuts the computation cost by about 89.2% and the memory requirement by about 65.5% while reducing recognition accuracy by 2.7% absolute (to 90.5%).
  • Keywords
    Gaussian processes; data compression; distortion; hidden Markov models; speech coding; speech recognition; unsupervised learning; additive distortion; compressed HMM parameters; convolutive distortion; multivariate Gaussian process; quantization; resource-constrained speech recognition; subspace coding; unsupervised online adaptation technique; vector Taylor series; Acoustic distortion; Algorithm design and analysis; Computational efficiency; Mobile computing; Noise robustness; Runtime; Taylor series; additive and convolutive distortions; joint compensation; mobile devices; resource constraint
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4518614
  • Filename
    4518614