DocumentCode
3363
Title
Feature Enhancement With Joint Use of Consecutive Corrupted and Noise Feature Vectors With Discriminative Region Weighting
Author
Suzuki, M. ; Yoshioka, Takashi ; Watanabe, Shigetaka ; Minematsu, Nobuaki ; Hirose, Keikichi
Author_Institution
Dept. of Electr. Eng. & Inf. Syst., Univ. of Tokyo, Tokyo, Japan
Volume
21
Issue
10
fYear
2013
fDate
Oct. 2013
Firstpage
2172
Lastpage
2181
Abstract
This paper proposes a feature enhancement method that achieves high speech recognition performance in a variety of noise environments at a feasible computational cost. Like the well-known Stereo-based Piecewise Linear Compensation for Environments (SPLICE) algorithm, the proposed method learns a piecewise linear transformation that maps corrupted feature vectors to the corresponding clean features, which enables efficient operation. To make the feature enhancement process adaptive to changes in noise, the piecewise linear transformation is performed using a subspace of the joint space of corrupted and noise feature vectors, where the subspace is chosen such that the classes (i.e., Gaussian mixture components) of the underlying clean feature vectors can be best predicted. In addition, we propose utilizing temporally adjacent frames of corrupted and noise features in order to leverage the dynamic characteristics of feature vectors. To prevent overfitting caused by the high dimensionality of the extended feature vectors covering the neighboring frames, we introduce a regularized weighted minimum mean square error criterion. The proposed method achieved relative improvements of 34.2% and 22.2% over SPLICE under the clean and multi-style conditions, respectively, on the Aurora 2 task.
Keywords
least mean squares methods; piecewise linear techniques; speech recognition; Aurora 2 task; SPLICE algorithm; consecutive corrupted vector; corrupted feature vectors mapping; discriminative region weighting; feature enhancement method; high speech recognition performance; noise feature vector; piecewise linear transformation; stereo-based piecewise linear compensation for environment; weighted minimum mean square error criterion; Feature enhancement; SPLICE; noise robust automatic speech recognition; non-stationary noise; vector Taylor series;
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2013.2270407
Filename
6544587