Regional Information Center for Science and Technology - LogMax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise

DocumentCode :

3161578

Title :

LogMax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise

Author :

Nakatani, Tomohiro ; Yoshioka, Takuya ; Araki, Shoko ; Delcroix, Marc ; Fujimoto, Masakiyo

Author_Institution :

NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan

Year :

2012

Date :

25-30 March 2012

Firstpage :

4029

Lastpage :

4032

Abstract :

This paper proposes a new single/multi-channel speech enhancement approach based on a LogMax observation model integrated with Gaussian mixture models of speech and noise mel-frequency cepstral coefficients (MFCC-GMM). It has been reported that the LogMax observation model has high potential for reducing highly nonstationary noise, for example, when it is combined with factorial hidden Markov models. In addition, it has recently been shown that a source location based speech enhancement approach can be easily incorporated into this model for more efficient and reliable estimation. However, the unique structure of the LogMax model has prevented us from using it with MFCC-GMMs, which is a fundamental limitation of this approach. Our proposal in this paper is aimed at overcoming this limitation. Experiments using the PASCAL CHiME separation and recognition challenge task show the superiority of the proposed approach as regards both speech quality and automatic speech recognition performance.

Keywords :

Gaussian processes; hidden Markov models; speech enhancement; speech recognition; Gaussian mixture models; LogMax observation model; MFCC-GMM; PASCAL CHiME recognition; PASCAL CHiME separation; automatic speech recognition performance; factorial hidden Markov models; highly nonstationary ambient noise reduction; highly nonstationary noise reduction; noise mel-frequency cepstral coefficients; single-multichannel speech enhancement approach; source location based speech enhancement approach; speech mel-frequency cepstral coefficients; speech quality; Estimation; Hidden Markov models; Noise; Speech; Speech enhancement; Speech recognition; Training; Speech enhancement; automatic speech recognition; mel-frequency cepstral coefficients; model-based approach;

Language :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6288802

Filename :

6288802

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3161578