• DocumentCode
    1475311
  • Title

    A Feature Compensation Approach Using High-Order Vector Taylor Series Approximation of an Explicit Distortion Model for Noisy Speech Recognition

  • Author

    Du, Jun ; Huo, Qiang

  • Author_Institution
    Visual Comput. Group, Microsoft Res. Asia (MSRA), Beijing, China
  • Volume
    19
  • Issue
    8
  • fYear
    2011
  • Firstpage
    2285
  • Lastpage
    2293
  • Abstract
    This paper presents a new feature compensation approach to noisy speech recognition that uses a high-order vector Taylor series (HOVTS) approximation of an explicit model of environmental distortions. Formulations are derived for maximum-likelihood (ML) estimation of both additive noises and convolutional distortions, and for minimum mean squared error (MMSE) estimation of clean speech. Experimental results on the Aurora2 and Aurora4 benchmark databases, where the modeling assumption of the distortion model is more accurate, demonstrate that the standard HOVTS-based feature compensation approaches achieve consistent and significant improvements in recognition accuracy over the traditional first-order VTS-based approach. For a real-world in-vehicle connected digits recognition task on the Aurora3 benchmark database, where the modeling assumption of the distortion model is less accurate, modifications are necessary to make VTS-based feature compensation approaches work. In this case, the second-order VTS-based approach performs only slightly better than the first-order VTS-based approach.
  • Keywords
    compensation; least mean squares methods; maximum likelihood estimation; speech recognition; Aurora2 benchmark database; Aurora3 benchmark database; Aurora4 benchmark database; HOVTS approximation; ML estimation; MMSE estimation; additive noises; convolutional distortions; environmental distortions; explicit distortion model; feature compensation approach; first-order VTS-based approach; high-order vector Taylor series approximation; in-vehicle connected digit recognition task; minimum mean squared error estimation; noisy speech recognition; Approximation methods; Cepstral analysis; Hidden Markov models; Nonlinear distortion; Signal to noise ratio; Speech; Speech recognition; Distortion model; feature compensation; noise robustness; robust speech recognition; vector Taylor series (VTS)
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2011.2129508
  • Filename
    5734800