Title :
LogMax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise
Author :
Nakatani, Tomohiro ; Yoshioka, Takuya ; Araki, Shoko ; Delcroix, Marc ; Fujimoto, Masakiyo
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
Abstract :
This paper proposes a new single- and multi-channel speech enhancement approach based on a LogMax observation model integrated with Gaussian mixture models of speech and noise mel-frequency cepstral coefficients (MFCC-GMMs). The LogMax observation model has been reported to have high potential for reducing highly nonstationary noise, for example when it is combined with factorial hidden Markov models. In addition, it has recently been shown that a source-location-based speech enhancement approach can be easily incorporated into this model for more efficient and reliable estimation. However, the unique structure of the LogMax model has so far prevented its use with MFCC-GMMs, which is a fundamental limitation of the approach. The proposal in this paper aims to overcome this limitation. Experiments using the PASCAL CHiME separation and recognition challenge task show the superiority of the proposed approach in terms of both speech quality and automatic speech recognition performance.
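The abstract names the LogMax observation model without stating it. As a minimal sketch, assuming the commonly used log-max approximation (the log power spectrum of a mixture is taken, bin by bin, as the larger of the speech and noise log power spectra), the idea can be illustrated as follows. The function names and toy values are hypothetical, and the sketch does not reproduce the paper's MFCC-GMM integration or its inference procedure.

```python
import math

def logmax_observation(log_speech, log_noise):
    """Log-max approximation: each bin of the mixture takes the larger log power."""
    return [max(s, n) for s, n in zip(log_speech, log_noise)]

def logsum_observation(log_speech, log_noise):
    """Reference value: log of the summed powers (assumes powers add, phase ignored)."""
    return [max(s, n) + math.log1p(math.exp(-abs(s - n)))
            for s, n in zip(log_speech, log_noise)]

def dominance_mask(log_speech, log_noise):
    """True where speech is at least as strong as noise in that bin."""
    return [s >= n for s, n in zip(log_speech, log_noise)]

# Toy log power spectra for four frequency bins (illustrative values only).
log_speech = [2.0, -1.0, 0.5, 3.0]
log_noise = [0.0, 1.5, 0.5, -2.0]

approx = logmax_observation(log_speech, log_noise)
exact = logsum_observation(log_speech, log_noise)
mask = dominance_mask(log_speech, log_noise)

# The gap between the two is largest when both sources are equally strong,
# where it equals log(2); it shrinks as one source dominates.
max_gap = max(e - a for e, a in zip(exact, approx))
```

Under this assumption each time-frequency bin is attributed to a single dominant source, which is why the approximation error never exceeds log 2 in the natural-log power domain.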
Keywords :
Gaussian processes; hidden Markov models; speech enhancement; speech recognition; Gaussian mixture models; LogMax observation model; MFCC-GMM; PASCAL CHiME recognition; PASCAL CHiME separation; automatic speech recognition performance; factorial hidden Markov models; highly nonstationary ambient noise reduction; highly nonstationary noise reduction; noise mel-frequency cepstral coefficients; single-multichannel speech enhancement approach; source location based speech enhancement approach; speech mel-frequency cepstral coefficients; speech quality; Estimation; Noise; Speech; Training; automatic speech recognition; mel-frequency cepstral coefficients; model-based approach
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288802