Regional Information Center for Science and Technology - Fast speaker adaptation of artificial neural networks for automatic speech recognition

DocumentCode :

2325503

Title :

Fast speaker adaptation of artificial neural networks for automatic speech recognition

Author :

Dupont, Stéphane ; Cheboub, Leila

Author_Institution :

TCTS-MULTITEL, Faculté Polytechnique de Mons, Belgium

Volume :

fYear :

2000

fDate :

2000

Firstpage :

1795

Abstract :

This paper presents a fast speaker adaptation technique for automatic speech recognition systems that use artificial neural networks (ANNs) to estimate hidden Markov model (HMM) state probabilities. Speaker-adapted ANNs are first obtained from the training data using affine transformations in the feature space. As in the “eigenvoice” approach, principal component analysis (PCA) is then applied to these transformation matrices. The first few eigenvectors span a low-dimensional space that captures most of the inter-speaker variability of the training set. During operation, these eigenvectors are used to constrain the optimization of the transformation matrices for new speakers. This optimization is performed by steepest descent, with gradients obtained by backpropagation through the speaker-independent ANN. The experiments use state-of-the-art hybrid HMM/ANN systems trained on the Phonebook database. Supervised adaptation experiments with different amounts of data show that the new technique outperforms standard linear regression in the feature space: with only 20 words of adaptation data, the word error rate decreases by 15% relative.
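The procedure outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the one-hidden-layer tanh network standing in for the speaker-independent ANN, the cross-entropy objective, and the step-halving rule are all assumptions made for illustration. Each training speaker's affine transform (A, b) is flattened into a vector, PCA yields a mean and a few eigenvectors, and a new speaker's transform is constrained to the mean plus a weighted sum of those eigenvectors. Only the weights `alpha` are optimized, using gradients backpropagated through the fixed network to its input.

```python
import numpy as np

def pca_basis(transforms, k):
    """PCA over flattened per-speaker transforms, shape (S, d*d + d)."""
    mean = transforms.mean(axis=0)
    _, _, vt = np.linalg.svd(transforms - mean, full_matrices=False)
    return mean, vt[:k]  # top-k orthonormal eigenvectors, shape (k, d*d + d)

def loss_and_grad(alpha, x, y, net, mean, basis):
    """Cross-entropy of the fixed network on transformed features, and d(loss)/d(alpha)."""
    n, d = x.shape
    W1, c1, W2, c2 = net                      # hypothetical speaker-independent MLP
    theta = mean + alpha @ basis              # transform constrained to the PCA subspace
    A, b = theta[:d * d].reshape(d, d), theta[d * d:]
    xt = x @ A.T + b                          # affine transform in feature space
    h = np.tanh(xt @ W1 + c1)
    z = h @ W2 + c2
    z = z - z.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    loss = -logp[np.arange(n), y].mean()
    # Backpropagate through the fixed network down to its input.
    dz = np.exp(logp)
    dz[np.arange(n), y] -= 1.0
    dz /= n
    dpre = (dz @ W2.T) * (1.0 - h ** 2)
    dxt = dpre @ W1.T                         # gradient w.r.t. transformed features
    dtheta = np.concatenate([(dxt.T @ x).ravel(), dxt.sum(axis=0)])
    return loss, basis @ dtheta               # chain rule onto the eigenvector weights

def adapt(x, y, net, mean, basis, lr=0.1, steps=50):
    """Steepest descent on alpha; the step is halved whenever the loss fails to drop."""
    alpha = np.zeros(basis.shape[0])
    loss, g = loss_and_grad(alpha, x, y, net, mean, basis)
    for _ in range(steps):
        cand = alpha - lr * g
        cand_loss, cand_g = loss_and_grad(cand, x, y, net, mean, basis)
        if cand_loss < loss:
            alpha, loss, g = cand, cand_loss, cand_g
        else:
            lr *= 0.5
    return alpha, loss
```

Because only `k` weights are estimated rather than all `d*d + d` entries of the transform, far fewer adaptation frames are needed. This is the motivation the abstract gives for constraining the optimization with the eigenvectors.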

Keywords :

backpropagation; eigenvalues and eigenfunctions; estimation theory; hidden Markov models; neural nets; optimisation; principal component analysis; probability; speech recognition; HMMs state probability estimation; Phonebook database; affine transformations; artificial neural networks; automatic speech recognition; backpropagation; eigenvectors; eigenvoice; fast speaker adaptation; gradients; hidden Markov models; inter-speaker variability; optimization; performance; principal components analysis; speaker independent ANN; steepest descent; training data; training set; transformation matrices; word error rate; Artificial neural networks; Automatic speech recognition; Backpropagation; Constraint optimization; Hidden Markov models; Linear regression; Principal component analysis; Spatial databases; State estimation; Training data;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on

Conference_Location :

Istanbul

ISSN :

1520-6149

Print_ISBN :

0-7803-6293-4

Type :

conf

DOI :

10.1109/ICASSP.2000.862102

Filename :

862102

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2325503