• DocumentCode
    2838398
  • Title

    Data-driven temporal filters based on maximum mutual information for robust features in speech recognition

  • Author

    Huang, Yung-Sheng ; Hung, Jeih-weih

  • Author_Institution
    Dept of Electr. Eng., Nat. Chi-Nan Univ., Taipei, Taiwan
  • fYear
    2004
  • fDate
    15-18 Dec. 2004
  • Firstpage
    105
  • Lastpage
    108
  • Abstract
    Linear discriminant analysis (LDA), principal component analysis (PCA) and minimum classification error (MCE) have been used to derive data-driven temporal filters that improve the robustness of speech features for speech recognition. In this paper, the maximum mutual information (MMI) criterion is proposed for constructing the temporal filters, and a detailed comparative analysis of these approaches is presented and discussed. Experimental results show that the MMI-derived temporal filters significantly improve the recognition performance of the original mel-frequency cepstral coefficient (MFCC) features compared to LDA/PCA/MCE-derived filters. In addition, when the MMI-derived filters are combined with conventional temporal filtering, namely cepstral mean and variance normalization (CMVN), the recognition performance can be further improved.
  • Keywords
    FIR filters; cepstral analysis; feature extraction; frequency estimation; minimisation; principal component analysis; speech recognition; CMVN; LDA; MCE; MFCC features; MMI; PCA; cepstral mean and variance normalization; data-driven temporal filters; linear discriminant analysis; maximum mutual information; mel frequency cepstrum coefficients; minimum classification error; principal component analysis; recognition performance; robust features; speech recognition; Cepstral analysis; Information filtering; Information filters; Linear discriminant analysis; Mel frequency cepstral coefficient; Mutual information; Nonlinear filters; Principal component analysis; Robustness; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing, 2004 International Symposium on
  • Print_ISBN
    0-7803-8678-7
  • Type

    conf

  • DOI
    10.1109/CHINSL.2004.1409597
  • Filename
    1409597