Regional Information Center for Science and Technology (RICEST) - Temporal Modulation Normalization for Robust Speech Feature Extraction and Recognition

DocumentCode :

2150112

Title :

Temporal Modulation Normalization for Robust Speech Feature Extraction and Recognition

Author :

Lu, Xugang ; Matsuda, Shigeki ; Unoki, Masashi ; Nakamura, Satoshi

Author_Institution :

Nat. Inst. of Inf. & Commun. Technol., Japan

fYear :

2009

fDate :

17-19 Oct. 2009

Firstpage :

Lastpage :

Abstract :

Traditional noise reduction methods are usually based on the assumption that the short-term statistical distributions of speech and noise differ. In contrast, we propose a noise reduction method based on the assumption that the temporal modulations of noise and speech differ. The proposed algorithm has two steps: temporal modulation contrast normalization and modulation-event-preserved smoothing. Because the proposed method works independently of traditional noise reduction methods, it can be combined with them to further reduce the effect of noise. We tested the proposed method as a front-end for robust speech recognition. Two advanced noise reduction methods, the ETSI advanced front-end (AFE) and particle filtering (PF) with minimum mean square error (MMSE) estimation, were used for comparison and in combination. Experimental results showed that the proposed method outperformed both advanced methods as an independent front-end processor, and that the combined front-ends consistently performed better than each method used on its own.
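The abstract names the two steps but does not give their details, so the following is only an illustrative sketch and not the authors' algorithm. It assumes the input is a matrix of log sub-band energies (bands by frames), substitutes per-band mean/variance normalization for the contrast normalization step, and substitutes a sliding median filter (a common edge-preserving smoother) for the modulation-event-preserved smoothing step. The function names, the window width, and the order of the two steps are all assumptions made for illustration.

```python
import numpy as np


def contrast_normalize(log_energy, eps=1e-8):
    """Stand-in for step 1 (assumption): per-band mean/variance
    normalization of each temporal trajectory.

    log_energy: array of shape (n_bands, n_frames).
    """
    mean = log_energy.mean(axis=1, keepdims=True)
    std = log_energy.std(axis=1, keepdims=True)
    return (log_energy - mean) / (std + eps)


def median_smooth(traj, width=5):
    """Stand-in for step 2 (assumption): sliding median along time.

    A median filter suppresses short fluctuations while keeping
    step-like onsets and offsets sharp. `width` must be odd.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd")
    pad = width // 2
    # Repeat the edge frames so the output has the same length as the input.
    padded = np.pad(traj, ((0, 0), (pad, pad)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, width, axis=1)
    return np.median(windows, axis=-1)


def process(log_energy, width=5):
    """Apply the two stand-in steps in the order the abstract lists them."""
    return median_smooth(contrast_normalize(log_energy), width)
```

A real front-end would derive the log sub-band energies from a filterbank analysis of the speech signal and pass the processed trajectories on to feature extraction; neither stage is shown here.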

Keywords :

feature extraction; least mean squares methods; particle filtering (numerical methods); signal denoising; smoothing methods; speech intelligibility; speech processing; speech recognition; statistical distributions; AFE; ETSI advanced front-end method; MMSE; independent front-end processor; minimum mean square error estimation; modulation event preserved smoothing; noise reduction; particle filtering; robust speech feature extraction; robust speech recognition; short-term statistical distribution; speech intelligibility; temporal modulation contrast normalization; Feature extraction; Filtering; Noise reduction; Noise robustness; Smoothing methods; Speech enhancement; Speech recognition; Statistical distributions; Telecommunication standards; Testing;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Image and Signal Processing, 2009. CISP '09. 2nd International Congress on

Conference_Location :

Tianjin

Print_ISBN :

978-1-4244-4129-7

Electronic_ISBN :

978-1-4244-4131-0

Type :

conf

DOI :

10.1109/CISP.2009.5303903

Filename :

5303903

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2150112