DocumentCode
774791
Title
Accurate compensation in the log-spectral domain for noisy speech recognition
Author
Afify, Mohamed
Author_Institution
BBN Syst. & Technol. Corp., Cambridge, MA, USA
Volume
13
Issue
3
fYear
2005
fDate
5/1/2005
Firstpage
388
Lastpage
398
Abstract
This paper presents a new algorithm for noise compensation in the log-spectral domain. We first note that, under a Gaussian mixture assumption, a compensation algorithm in the log-spectral domain is completely defined by three parameters for each Gaussian component: the noisy speech mean, the noisy speech variance, and the covariance of clean and noisy speech. Starting from a well-known mismatch function, we propose two new approximations that allow analytical expressions to be derived for these parameters, and hence develop a new noise compensation algorithm in the log-spectral domain. In addition to the theoretical derivations, we discuss implementation issues of the proposed method and analyze its computational complexity. Experimental results for digit recognition in the car reveal that the proposed technique significantly outperforms both the baseline and first-order vector Taylor series (VTS). For example, at a 10 dB signal-to-noise ratio, the baseline, first-order VTS, and the proposed method yield recognition accuracies of 82.6%, 85.5%, and 90.1%, respectively. The superiority of the proposed method over VTS can be attributed to the accuracy of the employed approximations. The compensation algorithm is also found to be more accurate and faster than an approximate numerical integration technique.
Keywords
Gaussian noise; approximation theory; computational complexity; speech recognition; Gaussian mixture assumption; digit recognition; log-spectral domain; mismatch function; noise compensation algorithm; noisy speech covariance; noisy speech mean; noisy speech recognition; noisy speech variance; Additive noise; Algorithm design and analysis; Distortion measurement; Mel frequency cepstral coefficient; Noise robustness; Speech enhancement; Statistics; Taylor series; noise compensation; robust speech recognition; vector Taylor series
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/TSA.2005.845811
Filename
1420373