DocumentCode :
1688202
Title :
Noise model transfer using affine transformation with application to large vocabulary reverberant speech recognition
Author :
Yoshioka, Takuya ; Nakatani, Tomohiro
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Keihanna Science City, Japan
fYear :
2013
Firstpage :
7058
Lastpage :
7062
Abstract :
This paper considers using the feature enhancement approach for automatic recognition of speech corrupted by severely nonstationary noise, caused for example by interfering talkers and inter-frame distortion induced by reverberation. In particular, we focus on the issue of feature-domain noise model estimation and investigate a recently proposed approach, called noise model transfer (NMT), for estimating the rapidly changing noise model parameter values. Based on the fact that noise spectral changes can be detected more easily in the power spectrum domain than in the feature domain, NMT estimates the noise model parameter values for each time frame by using both observed feature vectors and noise power spectral estimates, on the assumption that a separate noise power spectrum estimator is available. This is achieved by finding the best transformation that maps the power spectra onto the noise model parameter space in the maximum likelihood sense. Whereas the transformation was previously modeled using a bias vector, this paper employs a more flexible affine transformation model. The results of 20,000-word reverberant speech recognition experiments show the advantage of the affine transformation model.
Keywords :
distortion; maximum likelihood estimation; reverberation; spectral analysis; speech enhancement; speech recognition; vectors; NMT; affine transformation model; automatic speech recognition; bias vector; feature enhancement approach; feature-domain noise model estimation; inter-frame distortion; maximum likelihood sense; noise model parameter value estimation; noise model transfer; noise power spectrum estimator; noise spectral change detection; nonstationary noise; power spectrum domain; vocabulary reverberant speech recognition; Additive noise; Hidden Markov models; Speech; Robust speech recognition; vector Taylor series
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639031
Filename :
6639031