Regional Information Center for Science and Technology - A model of dynamic auditory perception and its application to robust word recognition

DocumentCode :

1548053

Title :

A model of dynamic auditory perception and its application to robust word recognition

Author :

Strope, Brian ; Alwan, Abeer

Author_Institution :

Dept. of Electr. Eng., California Univ., Los Angeles, CA, USA

Volume :

Issue :

fYear :

1997

fDate :

9/1/1997

Firstpage :

451

Lastpage :

464

Abstract :

This paper describes two mechanisms that augment the common automatic speech recognition (ASR) front end and provide adaptation and isolation of local spectral peaks. A dynamic model consisting of a linear filterbank with a novel additive logarithmic adaptation stage after each filter output is proposed. An extensive series of perceptual forward masking experiments, together with previously reported forward masking data, determine the model's dynamic parameters. Once parameterized, the simple exponential dynamic mechanism predicts the nature of forward masking data from several studies across wide ranging frequencies, input levels, and probe delay times. An initial evaluation of the dynamic model together with a local peak isolation mechanism as a front end for dynamic time warp (DTW) and hidden Markov model (HMM) word recognition systems shows an improvement in robustness to background noise when compared to Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and relative spectra (RASTA) based front ends.

Keywords :

band-pass filters; filtering theory; hearing; hidden Markov models; noise; parameter estimation; prediction theory; spectral analysis; speech processing; speech recognition; HMM word recognition systems; Mel-frequency cepstral coefficients; additive logarithmic adaptation; automatic speech recognition front end; background noise; dynamic auditory perception; dynamic parameters; exponential dynamic mechanism; filter output; forward masking data; hidden Markov model; input levels; linear filterbank; linear prediction cepstral coefficients; local spectral peaks; local spectral peaks isolation; perceptual forward masking experiments; probe delay times; relative spectra; robust word recognition; Automatic speech recognition; Cepstral analysis; Delay; Filter bank; Frequency; Hidden Markov models; Noise robustness; Nonlinear filters; Predictive models; Probes;

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.622569

Filename :

622569

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1548053