DocumentCode :
112676
Title :
Nonparametric Uncertainty Estimation and Propagation for Noise Robust ASR
Author :
Tran, Dung T. ; Vincent, Emmanuel ; Jouvet, Denis
Author_Institution :
Inria, Villers-lès-Nancy, France
Volume :
23
Issue :
11
fYear :
2015
fDate :
Nov. 2015
Firstpage :
1835
Lastpage :
1846
Abstract :
We consider the framework of uncertainty propagation for automatic speech recognition (ASR) in highly nonstationary noise environments. Uncertainty is considered as the variance of speech distortion. Yet, its accurate estimation in the spectral domain and its propagation to the feature domain remain difficult. Existing methods typically rely on a single uncertainty estimator and propagator fixed by mathematical approximation. In this paper, we propose a new paradigm where we seek to learn more powerful mappings to predict uncertainty from data. We investigate two such possible mappings: linear fusion of multiple uncertainty estimators/propagators and nonparametric uncertainty estimation/propagation. In addition, a procedure to propagate the estimated spectral-domain uncertainty to the static Mel frequency cepstral coefficients (MFCCs), to the log-energy, and to their first- and second-order time derivatives is proposed. This results in a full uncertainty covariance matrix over both static and dynamic MFCCs. Experimental evaluation on Tracks 1 and 2 of the 2nd CHiME Challenge resulted in up to 29% and 28% relative keyword error rate reduction with respect to speech enhancement alone.
Keywords :
covariance matrices; feature extraction; nonparametric statistics; signal denoising; spectral-domain analysis; speech recognition; automatic speech recognition; covariance matrix; dynamic MFCC; feature domain; first-order time derivatives; highly nonstationary noise environment; keyword error rate reduction; mathematical approximation; mel frequency cepstral coefficients; multiple uncertainty estimator linear fusion; multiple uncertainty propagators linear fusion; noise robust ASR; nonparametric uncertainty estimation; nonparametric uncertainty propagation; second-order time derivatives; spectral domain; spectral-domain uncertainty estimation; speech distortion; static MFCC; Covariance matrices; Decoding; Estimation; Noise; Spectral analysis; Speech; Uncertainty; Nonparametric estimation; robust speech recognition; uncertainty decoding; uncertainty estimation;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
2329-9290
Type :
jour
DOI :
10.1109/TASLP.2015.2450497
Filename :
7138603