DocumentCode
672347
Title
Learning state labels for sparse classification of speech with matrix deconvolution
Author
Hurmalainen, Antti ; Virtanen, Tuomas
Author_Institution
Tampere Univ. of Technol., Tampere, Finland
fYear
2013
fDate
8-12 Dec. 2013
Firstpage
168
Lastpage
173
Abstract
Non-negative spectral factorisation with long temporal context has been successfully used for noise-robust recognition of speech in multi-source environments. Sparse classification from activations of speech atoms can be employed instead of conventional GMMs to determine speech state likelihoods. For accurate classification, correct linguistic state labels must be assigned to speech atoms. We propose using non-negative matrix deconvolution for learning the labels with algorithms closely matching a framework that separates speech from additive noise. Experiments on the 1st CHiME Challenge corpus show improvement in recognition accuracy over labels acquired from original atom sources or from previously used least squares regression. The new approach also circumvents numerical issues encountered in previous learning methods and opens up possibilities for new speech basis generation algorithms.
Keywords
learning (artificial intelligence); least mean squares methods; matrix decomposition; regression analysis; speech recognition; CHiME challenge corpus; GMM; least squares regression; linguistic state labels; matrix deconvolution; multisource environments; noise robust speech recognition; nonnegative matrix deconvolution; nonnegative spectral factorisation; recognition accuracy; sparse speech classification; speech atoms; speech basis generation algorithms; speech state likelihoods; state label learning; temporal context; Hidden Markov models; Noise; Sparse matrices; Spectrogram; Speech; Speech recognition; Training; Automatic speech recognition; noise robustness; non-negative matrix factorization; sparse classification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location
Olomouc
Type
conf
DOI
10.1109/ASRU.2013.6707724
Filename
6707724