• DocumentCode
    1031929
  • Title

    Constructing Modulation Frequency Domain-Based Features for Robust Speech Recognition

  • Author

    Hung, Jeih-weih; Tsai, Wei-Yi

  • Author_Institution
    Nat. Chi Nan Univ., Nantou
  • Volume
    16
  • Issue
    3
  • fYear
    2008
  • fDate
    3/1/2008
  • Firstpage
    563
  • Lastpage
    577
  • Abstract
    Data-driven temporal filtering approaches based on a specific optimization technique have been shown to enhance the discrimination and robustness of speech features in speech recognition. The filters in these approaches are usually derived from the statistics of the features in the temporal domain. In this paper, we derive new data-driven temporal filters that instead employ the statistics of the modulation spectra of the speech features. Three new temporal filtering approaches are proposed, based on constrained versions of linear discriminant analysis (LDA), principal component analysis (PCA), and minimum class distance (MCD), respectively. It is shown that the proposed temporal filters can effectively improve speech recognition accuracy in various noise-corrupted environments. In experiments conducted on Test Set A of the Aurora-2 noisy digits database, these new temporal filters, together with cepstral mean and variance normalization (CMVN), provide average relative error reduction rates of over 40% and 27% when compared with baseline Mel frequency cepstral coefficient (MFCC) processing and CMVN alone, respectively.
  • Keywords
    principal component analysis; speech recognition; Aurora-2 noisy digits database; Mel frequency cepstral coefficient processing; cepstral mean and variance normalization; constructing modulation frequency domain-based features; data-driven temporal filtering; linear discriminant analysis; robust speech recognition; modulation frequency; noise-robust features
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2007.913405
  • Filename
    4429197