Title :
PCMM-based feature compensation schemes using model interpolation and mixture sharing
Author :
Kim, Wooil ; Kwon, Ohil ; Ko, Hanseok
Author_Institution :
Dept. of Electron. & Comput. Eng., Korea Univ., Seoul, South Korea
Abstract :
In this paper, we propose an effective feature compensation scheme based on a speech model to achieve robust speech recognition. The proposed method is based on the parallel combined mixture model (PCMM). Previous PCMM work requires a highly sophisticated procedure to estimate the combined mixture model so that it reflects the time-varying noise conditions of every utterance. The proposed schemes cope with time-varying background noise by interpolating multiple mixture models. We apply a 'data-driven' method to PCMM for more reliable model combination and introduce a frame-synched version for a posteriori estimation of environments. To reduce the computational complexity incurred by multiple models, we propose a mixture-sharing technique: statistically similar Gaussian components are selected, and smoothed versions are generated for sharing. Performance was evaluated on Aurora 2.0 and on a speech corpus recorded while driving a car. The experimental results indicate that the proposed schemes are effective in achieving robust speech recognition and reducing computational complexity under both simulated environments and real-life conditions.
Keywords :
Gaussian distribution; feature extraction; interpolation; parameter estimation; smoothing methods; speech recognition; Aurora 2.0; PCMM-based feature compensation; car-driving; computational complexity reduction; data-driven method; estimation; frame-synched version; mixture sharing; model interpolation; parallel combined mixture model; performance; robust speech recognition; smoothed versions; speech model; statistically similar Gaussian components; time-varying background noise; Background noise; Cepstral analysis; Computational complexity; Degradation; Interpolation; Maximum likelihood linear regression; Noise generators; Spatial databases; Speech recognition; Working environment noise;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1326154