Title :
Sparse Maximum A Posteriori adaptation
Author :
Olsen, Peder A. ; Huang, Jing ; Goel, Vaibhava ; Rennie, Steven J.
Author_Institution :
Dept. of Speech & Language Algorithms, IBM, Yorktown Heights, NY, USA
Abstract :
Maximum A Posteriori (MAP) adaptation is a powerful tool for building speaker-specific acoustic models. Modern speech applications use acoustic models with millions of parameters and serve millions of users, so storing a separate acoustic model for each user is costly. However, speaker-specific acoustic models are generally similar to the speaker-independent model from which they are adapted. By imposing sparseness constraints on the parameter differences, we can save significantly on storage and even improve the quality of the resulting speaker-dependent model. In this paper we use the ℓ1 or ℓ0 norm as a regularizer to induce sparsity, and show that both penalties yield up to 95% sparsity with negligible loss in recognition accuracy. By removing small differences, which constitute “adaptation noise”, sparse MAP is actually able to improve upon standard MAP adaptation: at 89% sparsity, sparse MAP reduces the MAP word error rate by 2% relative.
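The core idea in the abstract can be illustrated with thresholding operators applied to the difference between an adapted model and its speaker-independent prior. The sketch below is not the authors' implementation; the parameter vectors, threshold value, and helper names are hypothetical, and the ℓ1 case is shown via the standard soft-thresholding (proximal) operator, the ℓ0 case via hard thresholding:

```python
import numpy as np

def soft_threshold(delta, lam):
    # l1 proximal operator: shrinks each difference toward zero and
    # zeroes out entries whose magnitude is below lam
    return np.sign(delta) * np.maximum(np.abs(delta) - lam, 0.0)

def hard_threshold(delta, lam):
    # l0-style operator: keeps a difference only if it exceeds lam
    return np.where(np.abs(delta) > lam, delta, 0.0)

# Hypothetical parameter vectors: a speaker-independent model and a
# MAP-adapted counterpart (values are illustrative only).
rng = np.random.default_rng(0)
prior = rng.normal(size=1000)
adapted = prior + rng.normal(scale=0.01, size=1000)  # mostly tiny "adaptation noise"
adapted[:50] += rng.normal(scale=1.0, size=50)       # a few genuinely large changes

delta = adapted - prior
sparse_delta = soft_threshold(delta, lam=0.05)
sparsity = float(np.mean(sparse_delta == 0.0))
print(f"sparsity: {sparsity:.0%}")

# Only the sparse difference vector need be stored per speaker:
sparse_model = prior + sparse_delta
```

Because most differences are small, thresholding zeroes the bulk of the per-speaker delta, which is what makes per-user storage cheap while leaving the few large, genuinely speaker-specific changes intact.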
Keywords :
maximum likelihood estimation; speaker recognition; adaptation noise; modern speech applications; recognition accuracy; sparse maximum a posteriori adaptation; sparseness constraints; speaker specific acoustic models; speaker-dependent model; word error rate reduction; Acoustics; Adaptation models; Bayesian methods; Computational modeling; Data models; Error analysis; Hidden Markov models
Conference_Titel :
2011 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163905