DocumentCode :
3423704
Title :
Adaptation of compressed HMM parameters for resource-constrained speech recognition
Author :
Li, Jinyu ; Deng, Li ; Yu, Dong ; Wu, Jian ; Gong, Yifan ; Acero, Alex
Author_Institution :
Microsoft Corp., Redmond, WA
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
4333
Lastpage :
4336
Abstract :
Recently, we developed and reported a new unsupervised online adaptation technique that jointly compensates for additive and convolutive distortions with vector Taylor series (JAC/VTS) to adjust (uncompressed) HMMs under acoustically distorted environments. In this paper, we extend that technique to adapt compressed HMMs using JAC/VTS where only limited computation and/or memory resources are available for speech recognition (e.g., on mobile devices). Subspace coding (SSC) is developed and used to quantize each dimension of the multivariate Gaussians in the compressed HMMs. Three algorithmic design options that combine SSC with JAC/VTS are proposed and evaluated, each making a different tradeoff between recognition accuracy and the required computation, memory, and storage resources. The strengths and weaknesses of the three options are discussed and demonstrated on the Aurora2 noise-robust speech recognition task. The first option greatly reduces storage space and gives 93.2% accuracy, the same as the baseline, but with little reduction in run-time computation or memory cost. The second option reduces the computation cost by about 79.9% and the memory requirement by about 33.5% at the small price of a 0.5% absolute decrease in accuracy (to 92.7%). The third option cuts the computation cost by about 89.2% and the memory requirement by about 65.5% while reducing recognition accuracy by 2.7% absolute (to 90.5%).
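The abstract states that SSC quantizes each dimension of the multivariate Gaussians but gives no further detail. The following is only a minimal sketch of that general idea, assuming a separate scalar codebook per feature dimension trained with a simple 1-D Lloyd (k-means) procedure. The function names, the codebook size of 16, and the 39-dimensional random "means" are hypothetical illustrations, not the authors' actual SSC design or data.

```python
import numpy as np

def build_codebook(values, n_codewords, n_iters=20):
    """Train a 1-D Lloyd (k-means) codebook for one feature dimension."""
    # Assumption: initialise codewords at evenly spaced quantiles of the data.
    codebook = np.quantile(values, (np.arange(n_codewords) + 0.5) / n_codewords)
    for _ in range(n_iters):
        # Assign every value to its nearest codeword.
        idx = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
        # Move each codeword to the mean of its members (empty cells stay put).
        for k in range(n_codewords):
            members = values[idx == k]
            if members.size:
                codebook[k] = members.mean()
    idx = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook, idx.astype(np.uint8)

def compress_means(means, n_codewords=16):
    """Quantize each dimension of an (n_gaussians, n_dims) matrix separately.

    Returns codebooks of shape (n_dims, n_codewords) and an index table of
    shape (n_gaussians, n_dims). Indices are held as uint8 here for
    simplicity; 16 codewords would need only 4 bits each if packed.
    """
    codebooks, indices = [], []
    for d in range(means.shape[1]):
        cb, idx = build_codebook(means[:, d], n_codewords)
        codebooks.append(cb)
        indices.append(idx)
    return np.stack(codebooks), np.stack(indices, axis=1)

def decompress_means(codebooks, indices):
    """Look up the codeword for every (gaussian, dimension) pair."""
    dims = np.arange(codebooks.shape[0])
    return codebooks[dims, indices]

# Hypothetical stand-in for HMM Gaussian means: 200 Gaussians, 39 dimensions.
rng = np.random.default_rng(0)
means = rng.standard_normal((200, 39))
codebooks, indices = compress_means(means, n_codewords=16)
restored = decompress_means(codebooks, indices)
mse = float(np.mean((means - restored) ** 2))
```

Under these assumptions, storage shrinks from one float per parameter to one small index per parameter plus a shared codebook per dimension, at the cost of the quantization error measured by `mse`. How the paper combines such compressed parameters with JAC/VTS adaptation is not specified in the abstract and is not modeled here.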
Keywords :
Gaussian processes; data compression; distortion; hidden Markov models; speech coding; speech recognition; unsupervised learning; additive distortion; compressed HMM parameters; convolutive distortion; multivariate Gaussian process; quantization; resource-constrained speech recognition; subspace coding; unsupervised online adaptation technique; vector Taylor series; Acoustic distortion; Algorithm design and analysis; Computational efficiency; Mobile computing; Noise robustness; Runtime; Taylor series; additive and convolutive distortions; joint compensation; mobile devices; resource constraint
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518614
Filename :
4518614