Title :
Affine invariant sparse maximum a posteriori adaptation
Author :
Olsen, Peder A. ; Huang, Jing ; Rennie, Steven J. ; Goel, Vaibhava
Author_Institution :
T.J. Watson Res. Center, Dept. of Speech & Language Algorithms, IBM, Yorktown Heights, NY, USA
Abstract :
Modern speech applications utilize acoustic models with billions of parameters and serve millions of users. Storing an acoustic model for each user is costly. We show, through the use of sparse regularization, that it is possible to obtain competitive adaptation performance by changing only a small fraction of the parameters of an acoustic model. This allows for the compression of speaker-dependent models: a capability that has important implications for systems with millions of users. We achieve performance comparable to the best Maximum A Posteriori (MAP) adaptation models while adapting only 5% of the acoustic model parameters. Thus it is possible to compress the speaker-dependent acoustic models by close to a factor of 20. The proposed sparse adaptation criterion improves on previous work in three respects: it combines ℓ0 and ℓ1 penalties, uses different adaptation rates for mean and variance parameters, and is invariant to affine transformations.
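The combined ℓ0/ℓ1 penalty mentioned in the abstract can be illustrated with a generic proximal-thresholding sketch: the ℓ1 term shrinks each adaptation update toward zero (soft threshold), and the ℓ0 term then discards updates whose remaining magnitude does not justify a nonzero parameter change (hard threshold). This is an illustrative sketch only, not the paper's algorithm; the thresholds `lam1` and `lam0` and the function name are assumptions for demonstration.

```python
import numpy as np

def sparse_delta(delta, lam1=0.5, lam0=0.2):
    """Sparsify a dense adaptation update with a combined l1/l0 penalty.

    Illustrative sketch (not the paper's method): apply the proximal
    operator of lam1 * |d| (soft thresholding), then zero out entries
    whose post-shrinkage squared magnitude is below 2 * lam0, which is
    the hard-threshold rule induced by an l0 penalty lam0 * 1[d != 0].
    """
    # l1 step: soft threshold each coefficient toward zero
    soft = np.sign(delta) * np.maximum(np.abs(delta) - lam1, 0.0)
    # l0 step: keep a coefficient only if its squared value exceeds 2*lam0
    keep = soft ** 2 > 2.0 * lam0
    return np.where(keep, soft, 0.0)

# Example: a dense vector of speaker-adaptation updates becomes sparse,
# so only a small fraction of acoustic-model parameters need storing.
rng = np.random.default_rng(0)
delta = rng.normal(scale=1.0, size=10_000)
sparse = sparse_delta(delta)
fraction_adapted = np.count_nonzero(sparse) / sparse.size
```

Only the nonzero entries of `sparse` (and their indices) would need to be stored per speaker, which is the source of the compression factor the abstract reports.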
Keywords :
maximum likelihood estimation; speech processing; acoustic model; affine invariant sparse maximum a posteriori adaptation; competitive adaptation; sparse regularization; speaker dependent model compression; speech application; Acoustics; Adaptation models; Bayesian methods; Error analysis; Hidden Markov models; Speech recognition; Training; Bayesian prior; elastic net; non-smooth optimization;
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288874