  • DocumentCode
    3165358
  • Title
    Discriminative feature transforms using differenced maximum mutual information
  • Author
    Delcroix, Marc ; Ogawa, Atsunori ; Watanabe, Shinji ; Nakatani, Tomohiro ; Nakamura, Atsushi
  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    4753
  • Lastpage
    4756
  • Abstract
    Recently, feature compensation techniques that train feature transforms using a discriminative criterion have attracted much interest in the speech recognition community. Typically, the acoustic feature space is modeled by a Gaussian mixture model (GMM), and a feature transform is assigned to each Gaussian of the GMM. Feature compensation is then performed by transforming the features with the transform associated with each Gaussian and summing the transformed features, weighted by the posterior probability of each Gaussian. Several discriminative criteria have been investigated for estimating the feature transformation parameters, including maximum mutual information (MMI) and minimum phone error (MPE). Recently, the differenced MMI (dMMI) criterion, which generalizes MMI and MPE, has been shown to provide competitive performance for acoustic model training. In this paper, we investigate the use of the dMMI criterion for discriminative feature transforms and demonstrate in a noisy speech recognition experiment that dMMI achieves recognition performance superior to that of MMI or MPE.
  • Keywords
    Gaussian processes; probability; speech recognition; transforms; Gaussian mixture model; MMI; MPE; acoustic feature space; acoustic model training; differenced maximum mutual information; discriminative feature transforms; feature compensation; minimum phone error; noisy speech recognition; posterior probability; recognition performance; Acoustics; Hidden Markov models; Linear programming; Noise measurement; Speech recognition; Training; Transforms; Speech recognition; differenced MMI; discriminative feature transforms; discriminative training;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6288981
  • Filename
    6288981