DocumentCode
310459
Title
Efficient normalization based upon GPD [generalized probabilistic descent]
Author
Woudenberg, Eric ; Biem, Alain ; McDermott, Erik ; Katagiri, Shigeru
Author_Institution
ATR Int., Kyoto, Japan
Volume
4
fYear
1997
fDate
21-24 Apr 1997
Firstpage
3245
Abstract
We propose a simple but powerful method for normalizing various sources of mismatch between training and testing conditions in speech recognizers, based on a training methodology called the generalized probabilistic descent (GPD) method. In this new framework, a gradient-based method adapts the parameters of the feature extraction process to minimize the distortion between new speech data and existing classifier models, whereas most conventional normalization/adaptation methods adapt the classification parameters. GPD was proposed as a general discriminative training method for pattern recognizers such as neural networks. Until now it has been used only for classifier design, sometimes in combination with the design of a nonadaptive feature extractor. This paper, in contrast, studies the adaptive training benefits of GPD in the framework of normalizing the feature extractor to a new pattern environment. Experiments applying this technique to Japanese vowel classification demonstrate error rate reductions of as much as 40%.
Keywords
adaptive filters; adaptive signal processing; band-pass filters; feature extraction; filtering theory; pattern classification; probability; speech processing; speech recognition; GPD; HMM; Japanese vowel classification; adaptive filter banks; adaptive training; classification parameters; classifier models; distortion minimization; error rate reduction; experiments; feature extraction; general discriminative training method; generalized probabilistic descent method; gradient based method; neural networks; normalization; pattern recognizers; speech recognizers; testing conditions; training conditions; training methodology; Acoustic distortion; Feature extraction; Filter bank; Humans; Information processing; Loudspeakers; Maximum likelihood estimation; Speech processing; Speech recognition; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location
Munich
ISSN
1520-6149
Print_ISBN
0-8186-7919-0
Type
conf
DOI
10.1109/ICASSP.1997.595484
Filename
595484