• DocumentCode
    672347
  • Title
    Learning state labels for sparse classification of speech with matrix deconvolution
  • Author
    Hurmalainen, Antti ; Virtanen, Tuomas
  • Author_Institution
    Tampere Univ. of Technol., Tampere, Finland
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    168
  • Lastpage
    173
  • Abstract
    Non-negative spectral factorisation with long temporal context has been successfully used for noise robust recognition of speech in multi-source environments. Sparse classification from activations of speech atoms can be employed instead of conventional GMMs to determine speech state likelihoods. For accurate classification, correct linguistic state labels must be assigned to speech atoms. We propose using non-negative matrix deconvolution for learning the labels with algorithms closely matching a framework that separates speech from additive noises. Experiments on the 1st CHiME Challenge corpus show improvement in recognition accuracy over labels acquired from original atom sources or previously used least squares regression. The new approach also circumvents numerical issues encountered in previous learning methods, and opens up possibilities for new speech basis generation algorithms.
  • Keywords
    learning (artificial intelligence); least mean squares methods; matrix decomposition; regression analysis; speech recognition; CHiME challenge corpus; GMM; least squares regression; linguistic state labels; matrix deconvolution; multisource environments; noise robust speech recognition; nonnegative matrix deconvolution; nonnegative spectral factorisation; recognition accuracy; sparse speech classification; speech atoms; speech basis generation algorithms; speech state likelihoods; state label learning; temporal context; Hidden Markov models; Noise; Sparse matrices; Spectrogram; Speech; Speech recognition; Training; Automatic speech recognition; noise robustness; non-negative matrix factorization; sparse classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc
  • Type
    conf
  • DOI
    10.1109/ASRU.2013.6707724
  • Filename
    6707724