  • DocumentCode
    2790483
  • Title
    Discrimination of speech and non-linguistic vocalizations by Non-Negative Matrix Factorization
  • Author
    Schuller, Björn; Weninger, Felix
  • Author_Institution
    Inst. for Human-Machine Commun., Tech. Univ. München, München, Germany
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    5054
  • Lastpage
    5057
  • Abstract
    We introduce features based on Non-Negative Matrix Factorization (NMF) for discriminating speech from non-linguistic vocalizations such as laughter or breathing, a crucial task in the recognition of spontaneous speech. NMF has been used successfully in speech-related tasks such as de-noising and speaker separation. While existing approaches use it as a preprocessing step for conventional speech recognizers, we aim to classify the output of the NMF algorithm directly. To this end, we propose a feature extraction procedure based on a supervised variant of NMF, considering two different algorithms. Applying our approach to a spontaneous speech corpus, we show that adding NMF features to an MFCC-based classifier increases the mean recall of speech and non-linguistic vocalizations by over 2.5% absolute, and the recall of laughter in particular by 6.6% absolute. The improvement is significant at a level of 0.4%.
  • Keywords
    feature extraction; linguistics; matrix decomposition; signal classification; speech recognition; MFCC-based classifier; NMF algorithm; breathing; laughter; mean recall; nonlinguistic vocalization; nonnegative matrix factorization; speech discrimination; spontaneous speech corpus; Acoustic measurements; Man machine systems; Noise reduction; Performance evaluation; Signal processing; Signal processing algorithms; Spectrogram; Speech processing; Non-Negative Matrix Factorization; Non-linguistic vocalizations; Spontaneous speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495061
  • Filename
    5495061