Speech recognition of multiple accented English data using acoustic model interpolation

Author

Fraga-Silva, Thiago ; Gauvain, Jean-Luc ; Lamel, Lori

Author_Institution

Spoken Language Processing Group, LIMSI, Orsay, France

fYear

2014

fDate

1-5 Sept. 2014

Firstpage

1781

Lastpage

1785

Abstract

In previous work [1], we showed that model interpolation can be applied to adapt acoustic models to a specific show. Compared to other approaches, this method has the advantage of being highly flexible, allowing rapid adaptation by simply reassigning the interpolation coefficients. In this work, the approach is applied to the recognition of multi-accented English broadcast news data, a difficult task because accent variability degrades recognition performance. The work described in [1] is extended in two ways. First, to reduce the number of parameters of the interpolated model, a theoretically motivated EM-like mixture reduction algorithm is proposed. Second, beyond supervised adaptation, model interpolation is used as an unsupervised adaptation framework in which the interpolation coefficients are estimated on the fly for each test segment.
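
As an illustration of what estimating interpolation coefficients for a test segment can look like, the sketch below runs the standard EM update for the weights of a mixture whose component models are held fixed. It is not taken from the paper: the function name `estimate_interpolation_weights`, the per-frame likelihood table, and the two-model toy data are all assumptions made for this example, and the paper's own estimation procedure and mixture reduction algorithm may differ.

```python
def estimate_interpolation_weights(likelihoods, n_iter=50):
    """EM estimate of interpolation weights for fixed component models.

    likelihoods[t][k] is the (hypothetical) likelihood of frame t under
    component model k, e.g. one accent-specific acoustic model per column.
    """
    n_models = len(likelihoods[0])
    weights = [1.0 / n_models] * n_models  # start from uniform weights
    for _ in range(n_iter):
        totals = [0.0] * n_models
        for frame in likelihoods:
            # E-step: posterior probability of each model for this frame.
            joint = [w * p for w, p in zip(weights, frame)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # skip frames that no model can explain
            for k, value in enumerate(joint):
                totals[k] += value / norm
        total = sum(totals)
        if total == 0.0:
            break  # nothing to re-estimate from
        # M-step: new weights are the normalized posterior counts.
        weights = [t / total for t in totals]
    return weights


# Toy segment (assumed data): model 0 explains every frame better than model 1.
segment = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1]]
weights = estimate_interpolation_weights(segment)
```

Because only the weights are re-estimated while the component models stay untouched, the update is cheap enough to rerun per segment, which is the kind of flexibility the abstract attributes to the interpolation approach.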

Keywords

expectation-maximisation algorithm; interpolation; speech recognition; unsupervised learning; EM-like mixture reduction algorithm; English broadcast news data recognition; acoustic model interpolation; multiple accented English data; unsupervised adaptation framework; Acoustics; Adaptation models; Hidden Markov models; Speech; Training; Model interpolation; multi-accented data; supervised and unsupervised adaptation

fLanguage

English

Publisher

IEEE

Conference_Titel

2014 Proceedings of the 22nd European Signal Processing Conference (EUSIPCO)

Conference_Location

Lisbon

Type

conf

Filename

6952656