• DocumentCode
    25236
  • Title
    An Extension of STFT Uncertainty Propagation for GMM-Based Super-Gaussian a Priori Models
  • Author
    Fernandez Astudillo, Ramon
  • Author_Institution
    Spoken Language Syst. Lab., INESC-ID Lisboa, Lisbon, Portugal
  • Volume
    20
  • Issue
    12
  • fYear
    2013
  • fDate
    Dec. 2013
  • Firstpage
    1163
  • Lastpage
    1166
  • Abstract
    Feature compensation is a computationally inexpensive technique for achieving robust automatic speech recognition (ASR). Short-time Fourier Transform Uncertainty Propagation (STFT-UP) provides feature compensation in domains used for ASR, e.g., Mel-Frequency Cepstral Coefficients (MFCCs), while using STFT-domain distortion models. However, STFT-UP is limited to Gaussian priors when modeling speech distortion, whereas super-Gaussian priors are known to provide improved performance. In this letter, an extension of STFT-UP is presented that uses approximate super-Gaussian priors. This is achieved by extending the conventional complex Gaussian priors to complex Gaussian mixture priors. The approach can be applied to any of the existing STFT-UP solutions, thus providing super-Gaussian uncertainty propagation. The method is exemplified by a Minimum Mean Square Error (MMSE) MFCC estimator with an approximate generalized Gamma speech prior. This estimator clearly outperforms the Gaussian-based MMSE-MFCC feature compensation on the AURORA4 corpus.
  • Keywords
    Fourier transforms; Gaussian distribution; compensation; least mean squares methods; speech recognition; ASR; AURORA4 corpus; GMM; MFCC; MMSE; STFT uncertainty propagation; STFT-UP; automatic speech recognition; complex Gaussian mixture; feature compensation; gamma speech; mel-frequency cepstra coefficient; minimum mean square error; short-time Fourier transform uncertainty propagation; speech distortion; super-Gaussian a priori models; super-Gaussian uncertainty propagation; Computational modeling; Hidden Markov models; Mel frequency cepstral coefficient; Nonlinear distortion; Random variables; Speech; Uncertainty; super-Gaussian priors; uncertainty decoding; uncertainty propagation
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2013.2283493
  • Filename
    6609070