DocumentCode
3527583
Title
Acoustic model combination to compensate for residual noise in multi-channel source separation
Author
Yoon, Jae Sam ; Park, Ji Hun ; Kim, Hong Kook
Author_Institution
Dept. of Inf. & Commun., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju
fYear
2009
fDate
19-24 April 2009
Firstpage
3925
Lastpage
3928
Abstract
In this paper, we propose an acoustic model combination technique for reducing the mismatch in multi-channel noisy environments. To this end, we first apply a mask-based multi-channel source separation method, typically computational auditory scene analysis (CASA), to separate the speech source from noise. However, because the estimated mask is not ideal, some noise remains in the separated speech, especially under low signal-to-noise ratio (SNR) conditions, and this limits the performance of automatic speech recognition (ASR). To improve ASR performance, the remaining noise can be further compensated in the acoustic model domain within the framework of parallel model combination (PMC). In particular, the noise model for PMC is estimated from the noise remaining after mask-based source separation, and the SNR for PMC is estimated from the average relative magnitude of the mask over the utterance. Experiments show that the proposed acoustic model combination method achieves a relative word error rate reduction of 52.14% compared with mask-based source separation alone.
Keywords
acoustic signal processing; source separation; speech recognition; ASR performance; PMC approach; acoustic model combination method; automatic speech recognition; computational auditory scene analysis; mask-based multichannel source separation; parallel model combination; residual noise; Acoustic noise; Automatic speech recognition; Image analysis; Noise reduction; Signal to noise ratio; Source separation; Speech analysis; Speech coding; Speech enhancement; Working environment noise; Speech recognition; computational auditory scene analysis; mask-based SNR estimation; mask-based noise model estimation; multi-channel source separation; parallel model combination;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location
Taipei
ISSN
1520-6149
Print_ISBN
978-1-4244-2353-8
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2009.4960486
Filename
4960486