DocumentCode :
323594
Title :
Multi-resolution cepstral features for phoneme recognition across speech sub-bands
Author :
McCourt, Paul ; Vaseghi, S. ; Harte, Naomi
Author_Institution :
Sch. of Electr. Eng. & Comput. Sci., Queen's Univ., Belfast, UK
Volume :
1
fYear :
1998
fDate :
12-15 May 1998
Firstpage :
557
Abstract :
Multi-resolution sub-band cepstral features strive to exploit discriminative cues in localised regions of the spectral domain by supplementing the full bandwidth cepstral features with sub-band cepstral features derived from several levels of sub-band decomposition. Multi-resolution feature vectors, formed by concatenation of the sub-band cepstral features into an extended feature vector, are shown to yield better performance than conventional MFCCs for phoneme recognition on the TIMIT database. Possible strategies for the recombination of partial recognition scores from independent multi-resolution sub-band models are explored. By exploiting the sub-band variations in signal-to-noise ratio for linearly weighted recombination of the log likelihood probabilities, we obtained improved phoneme recognition performance in broadband noise compared to MFCC features. This is an advantage over a purely sub-band approach using non-linear recombination, which is robust only to narrow band noise.
Keywords :
band-pass filters; cepstral analysis; feature extraction; probability; signal resolution; speech processing; speech recognition; MFCC features; SNR; TIMIT database; broadband noise; discriminative cues; full bandwidth cepstral features; linearly weighted recombination; localised regions; log likelihood probabilities; mel-filterbank cepstral coefficients; multi-resolution cepstral features; multi-resolution feature vectors; narrow band noise; nonlinear recombination; partial recognition scores; phoneme recognition; signal to noise ratio; spectral domain; speech sub-bands; sub-band cepstral features; sub-band decomposition; Acoustic noise; Cepstral analysis; Computer science; Humans; Mel frequency cepstral coefficient; Multiresolution analysis; Narrowband; Noise robustness; Signal to noise ratio; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
ISSN :
1520-6149
Print_ISBN :
0-7803-4428-6
Type :
conf
DOI :
10.1109/ICASSP.1998.674491
Filename :
674491