DocumentCode :
417298
Title :
PCMM-based feature compensation schemes using model interpolation and mixture sharing
Author :
Kim, Wooil ; Kwon, Ohil ; Ko, Hanseok
Author_Institution :
Dept. of Electron. & Comput. Eng., Korea Univ., Seoul, South Korea
Volume :
1
fYear :
2004
fDate :
17-21 May 2004
Abstract :
In this paper, we propose an effective feature compensation scheme based on a speech model in order to achieve robust speech recognition. The proposed feature compensation method is based on the parallel combined mixture model (PCMM). Previous PCMM work requires a highly sophisticated procedure to estimate the combined mixture model so that it reflects the time-varying noise conditions of every utterance. The proposed schemes cope with time-varying background noise by interpolating multiple mixture models. We apply a 'data-driven' method to PCMM for more reliable model combination and introduce a frame-synchronized version for a posteriori estimation of the environment. To reduce the computational complexity caused by multiple models, we propose a mixture sharing technique: statistically similar Gaussian components are selected, and smoothed versions are generated for sharing. Performance was evaluated on Aurora 2.0 and on a speech corpus recorded while driving a car. The experimental results indicate that the proposed schemes are effective in achieving robust speech recognition and in reducing computational complexity under both simulated environments and real-life conditions.
Keywords :
Gaussian distribution; feature extraction; interpolation; parameter estimation; smoothing methods; speech recognition; Aurora 2.0; PCMM-based feature compensation; car-driving; computational complexity reduction; data-driven method; estimation; frame-synched version; mixture sharing; model interpolation; parallel combined mixture model; performance; robust speech recognition; smoothed versions; speech model; statistically similar Gaussian components; time-varying background noise; Background noise; Cepstral analysis; Computational complexity; Degradation; Interpolation; Maximum likelihood linear regression; Noise generators; Spatial databases; Speech recognition; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1326154
Filename :
1326154