• DocumentCode
    2972054
  • Title
    Improving joint uncertainty decoding performance by predictive methods for noise robust speech recognition
  • Author
    Xu, Haitian ; Gales, Mark J F ; Chin, K.K.
  • Author_Institution
    Cambridge Res. Lab., Toshiba Res. Eur. Ltd., Cambridge, UK
  • fYear
    2009
  • fDate
    Nov. 13 2009-Dec. 17 2009
  • Firstpage
    222
  • Lastpage
    227
  • Abstract
    Model-based noise compensation techniques, such as vector Taylor series (VTS) compensation, have been applied to a range of noise robustness tasks. However, one issue with these approaches is that they are computationally expensive for large speech recognition systems. To address this problem, schemes such as Joint Uncertainty Decoding (JUD) have been proposed. Though JUD is computationally more efficient, it typically degrades system performance. This paper proposes an alternative scheme, VTS-JUD, which is related to JUD but makes fewer approximations. Unfortunately, this approach also removes some of the computational advantages of JUD. To address this, VTS-JUD is not used directly; instead, it is used to obtain statistics for estimating a predictive linear transform, PCMLLR. This is computationally efficient and also limits some of the issues associated with the diagonal covariance matrices typically used with schemes such as VTS. PCMLLR can also be simply used within an adaptive training framework (PAT). The performance of the VTS-JUD, PCMLLR and PAT systems was compared with a number of standard approaches on an in-car speech recognition task. The proposed scheme is an attractive alternative to existing approaches.
  • Keywords
    covariance matrices; decoding; speech recognition; adaptive training framework; diagonal covariance matrices; in-car speech recognition task; joint uncertainty decoding performance; model-based noise compensation techniques; noise robust speech recognition; predictive linear transform; vector Taylor series compensation; Acoustic noise; Computational efficiency; Decoding; Degradation; Hidden Markov models; Noise robustness; Speech recognition; Taylor series; Uncertainty; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
  • Conference_Location
    Merano
  • Print_ISBN
    978-1-4244-5478-5
  • Electronic_ISBN
    978-1-4244-5479-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2009.5373317
  • Filename
    5373317