DocumentCode
3163098
Title
Affine invariant sparse maximum a posteriori adaptation
Author
Olsen, Peder A. ; Huang, Jing ; Rennie, Steven J. ; Goel, Vaibhava
Author_Institution
T.J. Watson Res. Center, Dept. of Speech & Language Algorithms, IBM, Yorktown Heights, NY, USA
fYear
2012
fDate
25-30 March 2012
Firstpage
4317
Lastpage
4320
Abstract
Modern speech applications utilize acoustic models with billions of parameters and serve millions of users. Storing an acoustic model for each user is costly. We show that, through the use of sparse regularization, it is possible to obtain competitive adaptation performance by changing only a small fraction of the parameters of an acoustic model. This allows for the compression of speaker-dependent models, a capability that has important implications for systems with millions of users. We achieve performance comparable to the best Maximum A Posteriori (MAP) adaptation models while adapting only 5% of the acoustic model parameters. Thus it is possible to compress the speaker-dependent acoustic models by close to a factor of 20. The proposed sparse adaptation criterion improves on previous work in three respects: it combines ℓ0 and ℓ1 penalties, has different adaptation rates for mean and variance parameters, and is invariant to affine transformations.
Keywords
maximum likelihood estimation; speech processing; acoustic model; affine invariant sparse maximum a posteriori adaptation; competitive adaptation; sparse regularization; speaker dependent model compression; speech application; Acoustics; Adaptation models; Bayesian methods; Error analysis; Hidden Markov models; Speech recognition; Training; Bayesian prior; elastic net; non-smooth optimization
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location
Kyoto
ISSN
1520-6149
Print_ISBN
978-1-4673-0045-2
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2012.6288874
Filename
6288874