• DocumentCode
    2971103
  • Title

    Discriminative adaptive training with VTS and JUD

  • Author

    Flego, F. ; Gales, M.J.F.

  • Author_Institution
    Eng. Dept., Cambridge Univ., Cambridge, UK
  • fYear
    2009
  • fDate
    Dec. 13 2009-Dec. 17 2009
  • Firstpage
    170
  • Lastpage
    175
  • Abstract
    Adaptive training is a powerful approach for building speech recognition systems on non-homogeneous training data. Recently, approaches based on predictive model-based compensation schemes, such as joint uncertainty decoding (JUD) and vector Taylor series (VTS), have been proposed. This paper reviews these model-based compensation schemes and relates them to factor-analysis style systems. Forms of maximum likelihood (ML) adaptive training with these approaches are described, based on both second-order optimisation schemes and expectation maximisation (EM). However, discriminative training is used in many state-of-the-art speech recognition systems. Hence, this paper proposes discriminative adaptive training with predictive model-compensation approaches for noise-robust speech recognition. This training approach is applied to both JUD and VTS compensation with minimum phone error training. A large-scale multi-environment training configuration is used, and the systems are evaluated on a range of tasks using data collected in-car.
  • Keywords
    expectation-maximisation algorithm; optimisation; speech coding; speech recognition; JUD compensation; VTS compensation; discriminative adaptive training; expectation maximisation; factor-analysis style systems; joint uncertainty decoding; large scale multienvironment training configuration; maximum likelihood adaptive training; minimum phone error training; nonhomogeneous training data; predictive model-based compensation schemes; second-order optimisation schemes; speech recognition systems; vector Taylor series; Acoustic noise; Background noise; Maximum likelihood estimation; Maximum likelihood linear regression; Parameter estimation; Power system modeling; Predictive models; Speech recognition; Training data; Uncertainty;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
  • Conference_Location
    Merano
  • Print_ISBN
    978-1-4244-5478-5
  • Electronic_ISBN
    978-1-4244-5479-2
  • Type

    conf

  • DOI
    10.1109/ASRU.2009.5373266
  • Filename
    5373266