• DocumentCode
    807626
  • Title
    A comparison of signal processing front ends for automatic word recognition
  • Author
    Jankowski, Charles R., Jr. ; Vo, Hoang-Doan H. ; Lippmann, Richard P.
  • Author_Institution
    Lincoln Lab., MIT, Lexington, MA, USA
  • Volume
    3
  • Issue
    4
  • fYear
    1995
  • fDate
    7/1/1995 12:00:00 AM
  • Firstpage
    286
  • Lastpage
    293
  • Abstract
    This paper compares the word error rate of a speech recognizer using several signal processing front ends based on auditory properties. Front ends were compared with a control mel filter bank (MFB) based cepstral front end in clean speech and with speech degraded by noise and spectral variability, using the TI-105 isolated word database. MFB recognition error rates ranged from 0.5 to 26.9% in noise, depending on the SNR, and auditory models provided error rates as much as four percentage points lower. With speech degraded by linear filtering, MFB error rates ranged from 0.5 to 3.1%, and the reduction in error rates provided by auditory models was less than 0.5 percentage points. Some earlier studies that demonstrated considerably more improvement with auditory models used linear predictive coding (LPC) based control front ends. This paper shows that MFB cepstra significantly outperform LPC cepstra under noisy conditions. Techniques using an optimal linear combination of features for data reduction were also evaluated.
  • Keywords
    cepstral analysis; error statistics; filters; linear predictive coding; signal processing; speech intelligibility; speech recognition; MFB recognition error rates; TI-105 isolated word database; auditory models; auditory properties; automatic word recognition; clean speech; control mel filter bank based cepstral front end; data reduction; degraded speech; linear filtering; linear predictive coding; noise; signal processing front ends; spectral variability; speech recognizer; word error rate; Automatic control; Automatic speech recognition; Degradation; Error analysis; Filter bank; Linear predictive coding; Signal processing; Speech enhancement; Speech processing; Speech recognition;
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type
    jour
  • DOI
    10.1109/89.397093
  • Filename
    397093