DocumentCode :
395192
Title :
Discriminative training of auditory filters of different shapes for robust speech recognition
Author :
Mak, Brian ; Tam, Yik-Cheung ; Hsiao, Roger
Author_Institution :
Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., China
Volume :
2
fYear :
2003
fDate :
6-10 April 2003
Abstract :
The bank-of-filters spectrum analysis model is commonly used in the extraction of acoustic features for automatic speech recognition. The most critical component of the analysis model is a bank of bandpass filters. We studied a data-driven approach to designing a bank of "optimal" filters of various shapes discriminatively so that the recognition error of a task is minimized. Three filter shapes with varying degrees of constraint were investigated: (1) parametric Gaussian filters; (2) non-parametric but constrained triangular-like filters; and (3) non-parametric and unconstrained free-formed filters. Filters were trained to derive the new robust auditory features proposed by Bell Labs. In addition, both the filters (and thus the ensuing acoustic features) and the acoustic model parameters were discriminatively trained. The major result is that our proposed triangular-like filters perform at least as well as the free-formed filters and perform better than the Gaussian filters.
Keywords :
Gaussian processes; channel bank filters; circuit optimisation; feature extraction; filtering theory; network synthesis; spectral analysis; speech recognition; Bell Labs; acoustic features extraction; acoustic model parameters; auditory filters; automatic speech recognition; bandpass filters; bank-of-filters spectrum analysis model; data-driven approach; discriminative training; nonparametric constrained triangular-like filters; nonparametric free-formed filters; optimal filters design; parametric Gaussian filters; recognition error; robust auditory features; robust speech recognition; unconstrained free-formed filters; Automatic speech recognition; Band pass filters; Feature extraction; Filtering; Nonlinear filters; Psychoacoustic models; Robustness; Shape; Speech analysis; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1202290
Filename :
1202290