Title :
MFCC enhancement using joint corrupted and noise feature space for highly non-stationary noise environments
Author :
Suzuki, Masayuki ; Yoshioka, Takuya ; Watanabe, Shinji ; Minematsu, Nobuaki ; Hirose, Keikichi
Author_Institution :
Univ. of Tokyo, Tokyo, Japan
Abstract :
One of the most effective approaches to noise robust speech recognition is to remove the noise effect directly from corrupted MFCC vectors. However, vector Taylor series (VTS) enhancement, a typical method for MFCC enhancement, provides limited improvement when the noise is highly non-stationary. This is because, to keep the computational cost at an acceptable level, the VTS enhancement method cannot use a time-varying noise model. This paper proposes a method that can enhance MFCC vectors and their dynamic parameters by using noise estimates that change on a frame-by-frame basis, at a practical computational cost. The proposed method employs stereo data-based feature mapping, like the well-known SPLICE algorithm. Its novelty lies in the use of the joint space spanned by a concatenated vector of corrupted and noise features. The paper also proposes using linear discriminant analysis to effectively reduce the dimensionality of the joint space. The proposed method achieves relative error reductions of 19.1% and 8.3% compared with the SPLICE and noise-mean normalized SPLICE algorithms, respectively.
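The mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a SPLICE-style piecewise bias correction in which the component posteriors come from a diagonal-covariance Gaussian mixture over the concatenated corrupted and noise features. The GMM parameters are taken as given, the LDA dimensionality reduction is omitted, and all function and variable names are hypothetical.

```python
import numpy as np


def posteriors(z, weights, means, variances):
    """Component posteriors p(k | z) for a diagonal-covariance GMM.

    z: (T, D) joint vectors; weights: (K,); means, variances: (K, D).
    """
    diff = z[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_lik = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_p = np.log(weights) + log_lik                             # (T, K)
    log_p -= log_p.max(axis=1, keepdims=True)                     # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def train_biases(clean, corrupted, noise, weights, means, variances):
    """Learn one correction vector per component from stereo data.

    Assumption: posteriors are evaluated on the joint vector [corrupted; noise],
    and each bias is the posterior-weighted mean of (clean - corrupted).
    """
    z = np.hstack([corrupted, noise])
    post = posteriors(z, weights, means, variances)               # (T, K)
    num = post.T @ (clean - corrupted)                            # (K, D)
    den = post.sum(axis=0)[:, None]                               # (K, 1)
    return num / np.maximum(den, 1e-10)


def enhance(corrupted, noise, biases, weights, means, variances):
    """Add the posterior-weighted bias to each corrupted frame."""
    z = np.hstack([corrupted, noise])
    post = posteriors(z, weights, means, variances)
    return corrupted + post @ biases
```

Because the posteriors depend on the per-frame noise estimate as well as on the corrupted feature, the selected correction can change from frame to frame without re-estimating a noise model, which is the property the abstract emphasizes.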
Keywords :
approximation theory; cepstral analysis; speech recognition; MFCC vector enhancement; VTS enhancement method; computational cost; corrupted concatenated vector; highly nonstationary noise environments; joint corrupted space; linear discriminant analysis; mel frequency cepstral coefficients; noise feature concatenated vector; noise feature space; noise robust speech recognition; noise-mean normalized SPLICE algorithms; stereo data-based feature mapping; time-varying noise model; vector Taylor series approximation-based algorithms; Accuracy; Joints; Mel frequency cepstral coefficient; Noise; Speech; Speech recognition; Vectors; Noise robust ASR; SPLICE; non-stationary noise
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288822