An Extension of STFT Uncertainty Propagation for GMM-Based Super-Gaussian a Priori Models

Author

Fernandez Astudillo, Ramon

Author_Institution

Spoken Language Syst. Lab., INESC-ID Lisboa, Lisbon, Portugal

Volume

20

Issue

12

fYear

2013

fDate

Dec. 2013

Firstpage

1163

Lastpage

1166

Abstract

Feature compensation is a computationally inexpensive technique for achieving robust automatic speech recognition (ASR). Short-Time Fourier Transform Uncertainty Propagation (STFT-UP) provides feature compensation in the domains used for ASR, e.g., Mel-Frequency Cepstral Coefficients (MFCCs), while using STFT-domain distortion models. However, STFT-UP is limited to Gaussian priors when modeling speech distortion, whereas super-Gaussian priors are known to provide improved performance. This letter presents an extension of STFT-UP that uses approximate super-Gaussian priors. This is achieved by extending the conventional complex Gaussian priors to complex Gaussian mixture priors. The approach can be applied to any of the existing STFT-UP solutions, thus providing super-Gaussian uncertainty propagation. The method is exemplified by a Minimum Mean Square Error (MMSE) MFCC estimator with an approximate generalized Gamma speech prior. This estimator clearly outperforms the Gaussian-based MMSE-MFCC feature compensation on the AURORA4 corpus.

Keywords

Fourier transforms; Gaussian distribution; compensation; least mean squares methods; speech recognition; ASR; AURORA4 corpus; GMM; MFCC; MMSE; STFT uncertainty propagation; STFT-UP; automatic speech recognition; complex Gaussian mixture; feature compensation; gamma speech; mel-frequency cepstral coefficient; minimum mean square error; short-time Fourier transform uncertainty propagation; speech distortion; super-Gaussian a priori models; super-Gaussian uncertainty propagation; Computational modeling; Hidden Markov models; Mel frequency cepstral coefficient; Nonlinear distortion; Random variables; Speech; Uncertainty; super-Gaussian priors; uncertainty decoding; uncertainty propagation

fLanguage

English

Journal_Title

Signal Processing Letters, IEEE

Publisher

IEEE

ISSN

1070-9908

Type

jour

DOI

10.1109/LSP.2013.2283493

Filename

6609070