Learning state labels for sparse classification of speech with matrix deconvolution

Author

Hurmalainen, Antti ; Virtanen, Tuomas

Author_Institution

Tampere University of Technology, Tampere, Finland

fYear

2013

fDate

8-12 Dec. 2013

Firstpage

168

Lastpage

173

Abstract

Non-negative spectral factorisation with long temporal context has been used successfully for noise-robust recognition of speech in multi-source environments. Sparse classification from the activations of speech atoms can be employed instead of conventional GMMs to determine speech state likelihoods. For accurate classification, correct linguistic state labels must be assigned to the speech atoms. We propose using non-negative matrix deconvolution to learn the labels, with algorithms closely matching a framework that separates speech from additive noise. Experiments on the 1st CHiME Challenge corpus show improved recognition accuracy over labels acquired from the original atom sources or from the previously used least squares regression. The new approach also circumvents numerical issues encountered in previous learning methods and opens up possibilities for new speech basis generation algorithms.
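
The sparse classification step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `state_scores`, the toy label weights, and the single-frame simplification (no long temporal context, no deconvolution) are all assumptions made for illustration. It only shows how per-atom state labels and non-negative atom activations might be combined into normalised state scores.

```python
def state_scores(labels, activations):
    """Combine atom activations into normalised per-state scores.

    labels: one row per state, each row holding one weight per atom
            (hypothetical label weights, not values from the paper).
    activations: one non-negative activation per atom.
    """
    if any(a < 0 for a in activations):
        raise ValueError("activations must be non-negative")
    # Weighted sum of activations for each state.
    raw = [sum(w * a for w, a in zip(row, activations)) for row in labels]
    total = sum(raw)
    if total == 0:
        # No active atoms: fall back to a uniform score over states.
        return [1.0 / len(labels)] * len(labels)
    return [r / total for r in raw]


# Toy example: 2 states, 3 atoms; atom 3 is shared equally by both states.
toy_labels = [[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5]]
toy_activations = [2.0, 0.0, 2.0]
print(state_scores(toy_labels, toy_activations))
```

In this simplified picture, the quality of the scores depends entirely on the label weights, which is why the abstract stresses assigning correct state labels to the atoms.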

Keywords

learning (artificial intelligence); least mean squares methods; matrix decomposition; regression analysis; speech recognition; CHiME challenge corpus; GMM; least squares regression; linguistic state labels; matrix deconvolution; multisource environments; noise robust speech recognition; nonnegative matrix deconvolution; nonnegative spectral factorisation; recognition accuracy; sparse speech classification; speech atoms; speech basis generation algorithms; speech state likelihoods; state label learning; temporal context; Hidden Markov models; Noise; Sparse matrices; Spectrogram; Speech; Speech recognition; Training; Automatic speech recognition; noise robustness; non-negative matrix factorization; sparse classification

fLanguage

English

Publisher

IEEE

Conference_Titel

2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

Conference_Location

Olomouc

Type

conf

DOI

10.1109/ASRU.2013.6707724

Filename

6707724