Regional Information Center for Science and Technology - Adaptation of compressed HMM parameters for resource-constrained speech recognition

DocumentCode :

3423704

Title :

Adaptation of compressed HMM parameters for resource-constrained speech recognition

Author :

Li, Jinyu ; Deng, Li ; Yu, Dong ; Wu, Jian ; Gong, Yifan ; Acero, Alex

Author_Institution :

Microsoft Corp., Redmond, WA

fYear :

2008

fDate :

March 31, 2008 - April 4, 2008

Firstpage :

4333

Lastpage :

4336

Abstract :

Recently, we developed and reported a new unsupervised online adaptation technique, which jointly compensates for additive and convolutive distortions with vector Taylor series (JAC/VTS), to adjust (uncompressed) HMMs in acoustically distorted environments. In this paper, we extend that technique to adapt compressed HMMs using JAC/VTS when only limited computation and/or memory resources are available for speech recognition (e.g., on mobile devices). Subspace coding (SSC) is developed and used to quantize each dimension of the multivariate Gaussians in the compressed HMMs. Three algorithmic design options that combine SSC with JAC/VTS are proposed and evaluated; each makes a different tradeoff between recognition accuracy and the required computation, memory, and storage resources. The strengths and weaknesses of the three options are discussed and demonstrated on the Aurora2 task of noise-robust speech recognition. The first option greatly reduces the storage space and gives 93.2% accuracy, the same as the baseline, but with little reduction in the run-time computation and memory cost. The second option reduces the computation cost by about 79.9% and the memory requirement by about 33.5%, at the small price of a 0.5% decrease in accuracy (to 92.7%). The third option cuts the computation cost by about 89.2% and the memory requirement by about 65.5%, while reducing recognition accuracy by 2.7% (to 90.5%).

Keywords :

Gaussian processes; data compression; distortion; hidden Markov models; speech coding; speech recognition; unsupervised learning; additive distortion; compressed HMM parameters; convolutive distortion; multivariate Gaussian process; quantization; resource-constrained speech recognition; subspace coding; unsupervised online adaptation technique; vector Taylor series; Acoustic distortion; Algorithm design and analysis; Computational efficiency; Mobile computing; Noise robustness; Runtime; Taylor series; additive and convolutive distortions; joint compensation; mobile devices; resource constraint

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location :

Las Vegas, NV

ISSN :

1520-6149

Print_ISBN :

978-1-4244-1483-3

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2008.4518614

Filename :

4518614

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3423704