Regional Information Center for Science and Technology - Noise model transfer using affine transformation with application to large vocabulary reverberant speech recognition

DocumentCode :

1688202

Title :

Noise model transfer using affine transformation with application to large vocabulary reverberant speech recognition

Author :

Yoshioka, Takuya ; Nakatani, Tomohiro

Author_Institution :

NTT Communication Science Laboratories, NTT Corporation, Keihanna Science City, Japan

fYear :

2013

Firstpage :

7058

Lastpage :

7062

Abstract :

This paper considers using the feature enhancement approach for automatic recognition of speech corrupted by severely nonstationary noise, caused for example by interfering talkers and inter-frame distortion induced by reverberation. In particular, we focus on the issue of feature-domain noise model estimation and investigate a recently proposed approach, called noise model transfer (NMT), for estimating the rapidly changing noise model parameter values. Based on the fact that noise spectral changes can be detected more easily in the power spectrum domain than in the feature domain, NMT estimates the noise model parameter values for each time frame by using both observed feature vectors and noise power spectral estimates, on the assumption that a separate noise power spectrum estimator is available. This is achieved by finding the best transformation that maps the power spectra onto the noise model parameter space in the maximum likelihood sense. Whereas the transformation was previously modeled using a bias vector, this paper employs a more flexible affine transformation model. The results of 20,000-word reverberant speech recognition experiments show the advantage of the affine transformation model.

Keywords :

distortion; maximum likelihood estimation; reverberation; spectral analysis; speech enhancement; speech recognition; vectors; NMT; affine transformation model; automatic speech recognition; bias vector; feature enhancement approach; feature-domain noise model estimation; inter-frame distortion; maximum likelihood sense; noise model parameter value estimation; noise model transfer; noise power spectrum estimator; noise spectral change detection; nonstationary noise; power spectrum domain; large vocabulary reverberant speech recognition; additive noise; hidden Markov models; speech; robust speech recognition; vector Taylor series

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location :

Vancouver, BC

ISSN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2013.6639031

Filename :

6639031

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1688202