DocumentCode
1475311
Title
A Feature Compensation Approach Using High-Order Vector Taylor Series Approximation of an Explicit Distortion Model for Noisy Speech Recognition
Author
Du, Jun ; Huo, Qiang
Author_Institution
Visual Comput. Group, Microsoft Res. Asia (MSRA), Beijing, China
Volume
19
Issue
8
fYear
2011
Firstpage
2285
Lastpage
2293
Abstract
This paper presents a new feature compensation approach to noisy speech recognition that uses a high-order vector Taylor series (HOVTS) approximation of an explicit model of environmental distortions. Formulations are derived for maximum-likelihood (ML) estimation of both additive noises and convolutional distortions, and for minimum mean squared error (MMSE) estimation of clean speech. Experimental results on the Aurora2 and Aurora4 benchmark databases, where the modeling assumption of the distortion model is more accurate, demonstrate that the standard HOVTS-based feature compensation approaches achieve consistently significant improvements in recognition accuracy compared with the traditional first-order VTS-based approach. For a real-world in-vehicle connected digits recognition task on the Aurora3 benchmark database, where the modeling assumption of the distortion model is less accurate, modifications are necessary to make VTS-based feature compensation approaches work. In this case, the second-order VTS-based approach performs only slightly better than the first-order VTS-based approach.
Keywords
compensation; least mean squares methods; maximum likelihood estimation; speech recognition; Aurora2 benchmark database; Aurora3 benchmark database; Aurora4 benchmark database; HOVTS approximation; ML estimation; MMSE estimation; additive noises; convolutional distortions; environmental distortions; explicit distortion model; feature compensation approach; first-order VTS-based approach; high-order vector Taylor series approximation; in-vehicle connected digit recognition task; minimum mean squared error estimation; noisy speech recognition; Approximation methods; Cepstral analysis; Hidden Markov models; Nonlinear distortion; Signal to noise ratio; Speech; Speech recognition; Distortion model; feature compensation; noise robustness; robust speech recognition; vector Taylor series (VTS)
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2011.2129508
Filename
5734800