  • DocumentCode
    2175810
  • Title
    Delta-spectral cepstral coefficients for robust speech recognition
  • Author
    Kumar, Kshitiz; Kim, Chanwoo; Stern, Richard M.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    4784
  • Lastpage
    4787
  • Abstract
    Almost all current automatic speech recognition (ASR) systems append delta and double-delta cepstral features to static cepstral features. In this work we describe a modified feature-extraction procedure in which the time-difference operation is performed in the spectral domain rather than in the cepstral domain, as is now standard practice. We argue that this approach, based on "delta-spectral" features, is needed because delta-cepstral features, although they capture dynamic speech information and generally improve ASR accuracy greatly, are not robust to noise and reverberation. We support the validity of the delta-spectral approach both with observations about the modulation spectrum of speech and noise, and with objective experiments that document the benefit that the delta-spectral approach brings to a variety of currently popular feature extraction algorithms. We found that using delta-spectral features in place of the more traditional delta-cepstral features improves the effective SNR by between 5 and 8 dB for background music and white noise, and that recognition accuracy in reverberant environments improves as well.
  • Keywords
    feature extraction; speech recognition; ASR recognition; SNR; automatic speech recognition; delta-spectral cepstral coefficients; double-delta cepstral features; feature extraction algorithms; spectral domain; static cepstral features; time-difference operation; Accuracy; Mel frequency cepstral coefficient; Signal to noise ratio; Speech; Speech recognition; denoising; dereverberation; speech analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947425
  • Filename
    5947425
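
The abstract's central idea, taking the time difference in the spectral domain instead of the cepstral domain, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the details, so the function names, the symmetric one-frame difference, the unnormalized DCT-II, and in particular the `signed_log` compression applied to the spectral deltas are all assumptions chosen only to make the ordering of operations concrete.

```python
import math

def dct_ii(vec, n_coeffs):
    """Unnormalized DCT-II of one frame, keeping the first n_coeffs terms."""
    n = len(vec)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(vec))
        for k in range(n_coeffs)
    ]

def time_difference(frames, lag=1):
    """Symmetric difference x[t+lag] - x[t-lag], clamping indices at the edges."""
    last = len(frames) - 1
    return [
        [a - b for a, b in zip(frames[min(t + lag, last)], frames[max(t - lag, 0)])]
        for t in range(len(frames))
    ]

def signed_log(x):
    """Placeholder compressive nonlinearity (an assumption, not from the abstract)."""
    return math.copysign(math.log1p(abs(x)), x)

def delta_cepstral(mel_power, n_coeffs, lag=1):
    """Conventional order: log -> DCT -> time difference on the cepstra."""
    cepstra = [dct_ii([math.log(p) for p in frame], n_coeffs) for frame in mel_power]
    return time_difference(cepstra, lag)

def delta_spectral(mel_power, n_coeffs, lag=1):
    """Sketched order: time difference on the spectrum -> nonlinearity -> DCT."""
    deltas = time_difference(mel_power, lag)
    return [dct_ii([signed_log(d) for d in frame], n_coeffs) for frame in deltas]

# Hypothetical toy input: 4 frames x 3 mel channels of positive power values.
mel_power = [
    [1.0, 2.0, 4.0],
    [1.5, 2.0, 3.0],
    [2.0, 2.5, 2.0],
    [2.0, 3.0, 1.0],
]
dcc = delta_cepstral(mel_power, n_coeffs=2)
dscc = delta_spectral(mel_power, n_coeffs=2)
```

Because the DCT and the time difference are both linear, the two orderings differ only through where the nonlinearity sits: the conventional path compresses before differencing, while the sketched delta-spectral path differences the power values first and compresses afterwards.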