DocumentCode :
3647
Title :
Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems
Author :
Siniscalchi, Sabato Marco ; Jinyu Li ; Chin-Hui Lee
Author_Institution :
Dept. of Comput. Eng., Kore Univ. of Enna, Enna, Italy
Volume :
21
Issue :
10
fYear :
2013
fDate :
Oct. 2013
Firstpage :
2152
Lastpage :
2161
Abstract :
Model adaptation techniques are an efficient way to reduce the mismatch that typically occurs between the training and test conditions of any automatic speech recognition (ASR) system. This work addresses the problem of increased degradation in performance when moving from speaker-dependent (SD) to speaker-independent (SI) conditions for connectionist (or hybrid) hidden Markov model/artificial neural network (HMM/ANN) systems in the context of large vocabulary continuous speech recognition (LVCSR). Adapting hybrid HMM/ANN systems with a small amount of adaptation data has proven to be a difficult task and has been a limiting factor in the widespread deployment of hybrid techniques in operational ASR systems. Addressing the crucial issue of speaker adaptation (SA) for hybrid HMM/ANN systems can therefore have a great impact on the connectionist paradigm, which will play a major role in the design of next-generation LVCSR systems, considering the great success reported on many speech tasks by deep neural networks (ANNs with many hidden layers that adopt the pre-training technique). Current adaptation techniques for ANNs, based on injecting an adaptable linear transformation network connected to either the input or the output layer, are not effective, especially with a small amount of adaptation data, e.g., a single adaptation utterance. In this paper, a novel solution is proposed to overcome those limits and make adaptation robust to scarce adaptation resources. The key idea is to adapt the hidden activation functions rather than the network weights. The adoption of Hermitian activation functions makes this possible. Experimental results on an LVCSR task demonstrate the effectiveness of the proposed approach.
Keywords :
hidden Markov models; next generation networks; polynomials; speech recognition; ANN; ASR; Hermitian activation functions; Hermitian polynomial; adaptable linear transformation network; adaptation techniques; artificial neural network; automatic speech recognition; connectionist speech recognition systems; hidden Markov model; hybrid HMM-ANN systems; large vocabulary continuous speech recognition; model adaptation techniques; neural networks; next-generation LVCSR; operational ASR systems; speaker adaptation; speaker-dependent conditions; speaker-independent conditions; Artificial neural networks; model adaptation; speech processing;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2013.2270370
Filename :
6544616