Title :
Generalization of temporal filter and linear transformation for robust speech recognition
Author :
Duc Hoang Ha Nguyen ; Xiong Xiao ; Eng Siong Chng ; Haizhou Li
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
Abstract :
Temporal filtering of feature trajectories and linear transformation of feature vectors are two effective ways to compensate speech features for robust speech recognition in noisy and reverberant environments. In previous studies, the two methods are usually applied in sequence, so the interaction between them is not optimized. In this paper, we propose a generalized transform that integrates the temporal filter and the linear transformation into a single process. The parameters of the new transform are optimized to minimize an approximated Kullback-Leibler divergence between the distribution of the compensated features and the distribution represented by a clean reference model. The proposed method is evaluated on the Aurora-5 clean-condition training task. The experiments show that the generalized transform significantly outperforms the simple cascade of temporal filtering and linear transformation. For example, in the speaker-based feature adaptation scheme, word accuracy improves from 81.55% (cascade) to 83.99% (generalized) for the office environment and from 72.09% (cascade) to 76.04% (generalized) for the living room environment.
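The relationship between the cascade and the generalized transform can be illustrated with a minimal sketch. It is not the paper's formulation: the function names, the filter-then-transform order, the symmetric filter window, and the edge clamping are all assumptions made for illustration. The sketch treats the generalized transform as one full matrix per time offset, y_t = sum_k A_k x_{t+k} + b, and shows that a per-dimension temporal filter followed by a linear transformation is the special case A_k[i][j] = W[i][j] * taps[j][k].

```python
# Hypothetical sketch: all names, shapes, and edge handling are assumptions,
# not taken from the paper.

def temporal_filter(frames, taps):
    """Filter each feature dimension's trajectory with its own FIR taps.

    frames: list of T feature vectors, each of length D.
    taps:   list of D tap lists, each of odd length 2L+1, centered on frame t.
    Frames outside the utterance are clamped to the nearest edge frame.
    """
    T, D = len(frames), len(frames[0])
    L = len(taps[0]) // 2
    out = []
    for t in range(T):
        row = []
        for d in range(D):
            acc = 0.0
            for k in range(-L, L + 1):
                s = min(max(t + k, 0), T - 1)  # clamp at utterance edges
                acc += taps[d][k + L] * frames[s][d]
            row.append(acc)
        out.append(row)
    return out


def linear_transform(frames, W, b):
    """Apply y = W x + b to every frame independently."""
    return [
        [sum(W[i][j] * x[j] for j in range(len(x))) + b[i] for i in range(len(W))]
        for x in frames
    ]


def generalized_transform(frames, A, b):
    """Compute y_t = sum_k A[k] x_{t+k} + b with one D x D matrix per offset."""
    T, D = len(frames), len(frames[0])
    L = len(A) // 2
    out = []
    for t in range(T):
        y = list(b)
        for k in range(-L, L + 1):
            x = frames[min(max(t + k, 0), T - 1)]  # same clamping as above
            M = A[k + L]
            for i in range(D):
                y[i] += sum(M[i][j] * x[j] for j in range(D))
        out.append(y)
    return out


def cascade_as_generalized(taps, W):
    """Express filter-then-transform as offset matrices A[k][i][j] = W[i][j] * taps[j][k]."""
    D, n = len(W), len(taps[0])
    return [[[W[i][j] * taps[j][k] for j in range(D)] for i in range(D)]
            for k in range(n)]
```

Under these assumptions the cascade constrains every offset matrix to be W with rescaled columns, whereas the generalized form lets each offset matrix vary freely, which is one way to read the abstract's remark that the cascade leaves the interaction between the two methods unoptimized. The Kullback-Leibler objective used to estimate the parameters is not sketched here.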
Keywords :
filtering theory; speech recognition; transforms; Aurora-5 clean condition training task; approximated Kullback-Leibler divergence; clean reference model; compensated feature distribution; feature trajectory; feature vectors; generalized transform; linear transformation; noisy environments; reverberant environments; robust speech recognition; speaker based feature adaptation scheme; temporal filter generalization; Acoustics; Robustness; Speech; Speech processing; Vectors; Kullback-Leibler divergence; reverberant speech recognition; temporal filter
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6853894