Title :
Robust speech recognition based on structured modeling, irrelevant variability normalization and unsupervised online adaptation
Author :
Huo, Qiang ; Zhu, Donglai
Author_Institution :
Microsoft Res. Asia, Beijing
Abstract :
We present a new approach to robust speech recognition based on structured modeling, irrelevant variability normalization (IVN) and unsupervised online adaptation (OLA). In the offline training stage, a set of generic HMMs for basic speech units relevant to phonetic classification is trained, along with several sets of feature transforms with different degrees of freedom, using a maximum likelihood (ML) IVN-based training strategy. In the recognition stage, after a first-pass recognition, the most appropriate set of feature transforms is identified and adapted under the ML criterion using the unknown utterance itself. The utterance is then recognized again, using the adapted feature transforms and the pre-trained generic HMMs, to achieve better performance. The effectiveness of the proposed approach is confirmed by evaluation experiments on the Finnish Aurora3 database.
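The two-pass recognition procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes bias-only feature transforms, unit-variance Gaussians standing in for the generic HMMs, and nearest-mean frame labeling standing in for recognition. "Degrees of freedom" is modeled here as the number of bias vectors in a set. All names (`decode`, `adapt`, `two_pass`, `tying`) are hypothetical.

```python
import numpy as np

def decode(frames, means, state_bias):
    """Toy recognizer: pick, per frame, the state whose mean is nearest
    after shifting the frame by that state's bias (state_bias is (S, D))."""
    d = ((frames[:, None, :] + state_bias[None] - means[None]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def log_lik(frames, means, labels, tying, biases):
    """Unit-variance Gaussian log-likelihood (up to a constant) of the
    transformed frames given the hypothesized state labels."""
    x = frames + biases[tying[labels]]
    return -0.5 * float(((x - means[labels]) ** 2).sum())

def adapt(frames, means, labels, tying, biases):
    """ML re-estimate of each bias from the frames tied to it.
    For a pure shift with unit variance this is a simple average."""
    new = biases.copy()
    cls = tying[labels]
    for k in range(len(biases)):
        mask = cls == k
        if mask.any():  # keep the pre-trained bias if no frame uses it
            new[k] = (means[labels[mask]] - frames[mask]).mean(axis=0)
    return new

def two_pass(frames, means, transform_sets):
    """transform_sets: list of (tying, biases); tying maps state -> bias index."""
    # First pass: recognize without any transform (an assumption of this sketch).
    labels = decode(frames, means, np.zeros_like(means))
    # Identify the set that best explains the utterance under the ML criterion.
    scores = [log_lik(frames, means, labels, t, b) for t, b in transform_sets]
    best = int(np.argmax(scores))
    tying, biases = transform_sets[best]
    # Adapt the selected set on the utterance itself (unsupervised).
    biases = adapt(frames, means, labels, tying, biases)
    # Second pass: recognize again with the adapted transforms.
    return best, biases, decode(frames, means, biases[tying])
```

A call takes the utterance frames as a `(T, D)` array, the state means as an `(S, D)` array, and a list of candidate transform sets. It returns the index of the selected set, its adapted biases, and the second-pass labels.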
Keywords :
hidden Markov models; maximum likelihood estimation; speech processing; speech recognition; Finnish Aurora3 database; IVN-based training strategy; feature transforms; generic HMM; irrelevant variability normalization; maximum likelihood; phonetic classification; robust speech recognition; structured modeling; unsupervised online adaptation; Asia; Automatic speech recognition; Decoding; Gaussian processes; Labeling; Robustness; Spatial databases; Training data; feature transformation; online adaptation
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-2353-8
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2009.4960664