Title :
Discriminative training of auditory filters of different shapes for robust speech recognition
Author :
Mak, Brian ; Tam, Yik-Cheung ; Hsiao, Roger
Author_Institution :
Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., China
Abstract :
The bank-of-filters spectrum analysis model is commonly used in the extraction of acoustic features for automatic speech recognition. The most critical component in the analysis model is a bank of bandpass filters. We studied a data-driven approach that discriminatively designs a bank of "optimal" filters of various shapes so that the recognition error of a task is minimized. Three different shapes with varying degrees of constraint were investigated: (1) parametric Gaussian filters; (2) non-parametric but constrained triangular-like filters; and (3) non-parametric and unconstrained free-formed filters. Filters were trained to derive the new robust auditory features proposed by Bell Labs. In addition, both the filters (and thus the ensuing acoustic features) and the acoustic model parameters were discriminatively trained. The major result is that our proposed triangular-like filters perform at least as well as the free-formed filters and perform better than the Gaussian filters.
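As a rough illustration of the bank-of-filters model described in the abstract, the sketch below builds one Gaussian and one triangular filter over spectral bins and computes log filter-bank energies. It is a minimal sketch under stated assumptions, not the authors' method: the function names, the bin-index parameterization, and the log floor are hypothetical, and neither the Bell Labs auditory feature pipeline nor the discriminative training of the filters is shown.

```python
import math


def gaussian_filter(num_bins, center, width):
    """Weights of one parametric Gaussian filter over bin indices (assumed form)."""
    return [math.exp(-0.5 * ((k - center) / width) ** 2) for k in range(num_bins)]


def triangular_filter(num_bins, left, center, right):
    """Weights of one triangular filter: rises from left to center, falls to right."""
    weights = []
    for k in range(num_bins):
        if left < k <= center:
            weights.append((k - left) / (center - left))
        elif center < k < right:
            weights.append((right - k) / (right - center))
        else:
            weights.append(0.0)
    return weights


def filterbank_energies(power_spectrum, filters):
    """Log of the weighted sum of spectral power under each filter."""
    floor = 1e-10  # hypothetical floor that avoids log(0) for empty channels
    return [
        math.log(max(sum(w * p for w, p in zip(f, power_spectrum)), floor))
        for f in filters
    ]


# Usage: a flat toy spectrum passed through a two-filter bank.
spectrum = [1.0] * 16
bank = [gaussian_filter(16, 4, 1.5), triangular_filter(16, 6, 10, 14)]
energies = filterbank_energies(spectrum, bank)
```

In this picture, a Gaussian filter is described by two parameters (center and width), whereas a free-formed filter would treat every bin weight as its own trainable value; the triangular-like filters of the abstract sit between those two extremes.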
Keywords :
Gaussian processes; channel bank filters; circuit optimisation; feature extraction; filtering theory; network synthesis; spectral analysis; speech recognition; Bell Labs; acoustic features extraction; acoustic model parameters; auditory filters; automatic speech recognition; bandpass filters; bank-of-filters spectrum analysis model; data-driven approach; discriminative training; nonparametric constrained triangular-like filters; nonparametric free-formed filters; optimal filters design; parametric Gaussian filters; recognition error; robust auditory features; robust speech recognition; unconstrained free-formed filters; Automatic speech recognition; Band pass filters; Feature extraction; Filtering; Nonlinear filters; Psychoacoustic models; Robustness; Shape; Speech analysis; Speech recognition;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
Print_ISBN :
0-7803-7663-3
DOI :
10.1109/ICASSP.2003.1202290