Efficient normalization based upon GPD [generalized probabilistic descent]

Author

Woudenberg, Eric ; Biem, Alain ; McDermott, Erik ; Katagiri, Shigeru

Author_Institution

ATR Int., Kyoto, Japan

Volume

4

fYear

1997

fDate

21-24 Apr 1997

Firstpage

3245

Abstract

We propose a simple but powerful method for normalizing various sources of mismatch between training and testing conditions in speech recognizers, based on a training methodology called the generalized probabilistic descent (GPD) method. In this new framework, a gradient-based method adapts the parameters of the feature extraction process to minimize the distortion between new speech data and existing classifier models, whereas most conventional normalization/adaptation methods adapt the classification parameters. GPD was proposed as a general discriminative training method for pattern recognizers such as neural networks. Until now it has been used only for classifier design, sometimes in combination with the design of a non-adaptive feature extractor. This paper, in contrast, studies the adaptive training benefits of GPD for normalizing the feature extractor to a new pattern environment. Experiments applying this technique to Japanese vowel classification demonstrate error rate reductions of as much as 40%.
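
The idea in the abstract (keep the classifier fixed and move the feature extractor by gradient descent on a smoothed classification-error loss) can be illustrated with a toy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the nearest-prototype classifier, the per-dimension gain/bias "feature extractor", the best-rival misclassification measure with a sigmoid loss, the synthetic data, and all names such as `gpd_adapt`. Batch updates are used for determinism, whereas GPD is usually described with sample-by-sample updates.

```python
import numpy as np

def error_rate(x, labels, prototypes, w, b):
    """Nearest-prototype error rate after the extractor f(x) = w * x + b."""
    f = w * x + b
    dist = np.sum((f[:, None, :] - prototypes[None, :, :]) ** 2, axis=2)
    return float(np.mean(np.argmin(dist, axis=1) != labels))

def gpd_adapt(x, labels, prototypes, steps=200, lr=0.05, alpha=1.0):
    """Adapt extractor gain w and bias b; the prototypes are never changed."""
    n, dim = x.shape
    w, b = np.ones(dim), np.zeros(dim)
    idx = np.arange(n)
    for _ in range(steps):
        f = w * x + b
        dist = np.sum((f[:, None, :] - prototypes[None, :, :]) ** 2, axis=2)
        # Closest wrong class (the "rival") for each sample.
        masked = dist.copy()
        masked[idx, labels] = np.inf
        rival = np.argmin(masked, axis=1)
        # Misclassification measure: positive when the rival is closer.
        m = dist[idx, labels] - dist[idx, rival]
        # Sigmoid loss as a smooth stand-in for the 0/1 error count.
        loss = 1.0 / (1.0 + np.exp(-alpha * m))
        slope = alpha * loss * (1.0 - loss)          # d loss / d m
        # d m / d f = 2 (f - p_true) - 2 (f - p_rival) = 2 (p_rival - p_true)
        grad_f = slope[:, None] * 2.0 * (prototypes[rival] - prototypes[labels])
        # Chain rule through f = w * x + b, averaged over the batch.
        w -= lr * np.mean(grad_f * x, axis=0)
        b -= lr * np.mean(grad_f, axis=0)
    return w, b

# Synthetic stand-in for "existing classifier models" and mismatched new data.
rng = np.random.default_rng(0)
prototypes = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
labels = np.repeat(np.arange(3), 100)
clean = prototypes[labels] + 0.3 * rng.standard_normal((300, 2))
x_new = clean + np.array([1.0, 1.0])   # constant offset models the mismatch

err_before = error_rate(x_new, labels, prototypes, np.ones(2), np.zeros(2))
w, b = gpd_adapt(x_new, labels, prototypes)
err_after = error_rate(x_new, labels, prototypes, w, b)
print(f"error before: {err_before:.3f}, after: {err_after:.3f}")
```

In this sketch only `w` and `b` are updated, which mirrors the abstract's contrast with methods that adapt the classification parameters. The toy requires labeled adaptation data; whether and how the paper uses labels is not stated in this record.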

Keywords

adaptive filters; adaptive signal processing; band-pass filters; feature extraction; filtering theory; pattern classification; probability; speech processing; speech recognition; GPD; HMM; Japanese vowel classification; adaptive filter banks; adaptive training; classification parameters; classifier models; distortion minimization; error rate reduction; experiments; general discriminative training method; generalized probabilistic descent method; gradient based method; neural networks; normalization; pattern recognizers; speech recognizers; testing conditions; training conditions; training methodology; Acoustic distortion; Filter bank; Humans; Information processing; Loudspeakers; Maximum likelihood estimation; Testing

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location

Munich

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.595484

Filename

595484