DocumentCode
323594
Title
Multi-resolution cepstral features for phoneme recognition across speech sub-bands
Author
McCourt, Paul ; Vaseghi, S. ; Harte, Naomi
Author_Institution
Sch. of Electr. Eng. & Comput. Sci., Queen's Univ., Belfast, UK
Volume
1
fYear
1998
fDate
12-15 May 1998
Firstpage
557
Abstract
Multi-resolution sub-band cepstral features aim to exploit discriminative cues in localised regions of the spectral domain by supplementing the full-bandwidth cepstral features with sub-band cepstral features derived from several levels of sub-band decomposition. Multi-resolution feature vectors, formed by concatenating the sub-band cepstral features into an extended feature vector, are shown to yield better performance than conventional MFCCs for phoneme recognition on the TIMIT database. Possible strategies for recombining partial recognition scores from independent multi-resolution sub-band models are explored. By exploiting the sub-band variations in signal-to-noise ratio for linearly weighted recombination of the log-likelihood probabilities, we obtained improved phoneme recognition performance in broadband noise compared to MFCC features. This is an advantage over a purely sub-band approach using non-linear recombination, which is robust only to narrow-band noise.
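The linearly weighted recombination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual weighting rule, so the mapping from SNR to weights below (proportional to linear-scale SNR, normalised to sum to 1) is an assumption, and the names `snr_weights`, `recombine`, and `best_phoneme` are hypothetical, not taken from the paper.

```python
def snr_weights(snr_db):
    """Map per-band SNR estimates (in dB) to weights that sum to 1.

    Assumption: weights are proportional to linear-scale SNR. The paper's
    own weighting rule is not given in the abstract.
    """
    linear = [10.0 ** (s / 10.0) for s in snr_db]
    total = sum(linear)
    return [v / total for v in linear]


def recombine(log_likelihoods, weights):
    """Return the weighted sum of per-band log likelihoods per phoneme.

    log_likelihoods maps each phoneme label to a list holding one log
    likelihood per sub-band model, in the same order as weights.
    """
    return {
        phoneme: sum(w * ll for w, ll in zip(weights, band_lls))
        for phoneme, band_lls in log_likelihoods.items()
    }


def best_phoneme(log_likelihoods, snr_db):
    """Pick the phoneme with the highest recombined score."""
    scores = recombine(log_likelihoods, snr_weights(snr_db))
    return max(scores, key=scores.get)


# Illustrative values only (not results from the paper): two sub-bands,
# two candidate phonemes.
example = {"aa": [-10.0, -30.0], "iy": [-20.0, -12.0]}

# Band 1 is clean (20 dB) and band 2 is noisy (0 dB), so band 1 dominates.
print(best_phoneme(example, [20.0, 0.0]))   # aa

# With equal SNR in both bands, the plain average decides instead.
print(best_phoneme(example, [10.0, 10.0]))  # iy
```

In this sketch a band with low estimated SNR contributes little to the combined score, which is one plausible way to down-weight noisy sub-bands without discarding them entirely.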
Keywords
band-pass filters; cepstral analysis; feature extraction; probability; signal resolution; speech processing; speech recognition; MFCC features; SNR; TIMIT database; broadband noise; discriminative cues; full bandwidth cepstral features; linearly weighted recombination; localised regions; log likelihood probabilities; mel-filterbank cepstral coefficients; multi-resolution cepstral features; multi-resolution feature vectors; narrow band noise; nonlinear recombination; partial recognition scores; phoneme recognition; signal to noise ratio; spectral domain; speech sub-bands; sub-band cepstral features; sub-band decomposition; Acoustic noise; Cepstral analysis; Computer science; Humans; Mel frequency cepstral coefficient; Multiresolution analysis; Narrowband; Noise robustness; Signal to noise ratio; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location
Seattle, WA
ISSN
1520-6149
Print_ISBN
0-7803-4428-6
Type
conf
DOI
10.1109/ICASSP.1998.674491
Filename
674491