DocumentCode :
2178448
Title :
On the use of ideal binary masks for improving phonetic classification
Author :
Narayanan, Arun ; Wang, DeLiang
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
5212
Lastpage :
5215
Abstract :
Ideal binary masks are binary patterns that encode the masking characteristics of speech in noise. Recent evidence in speech perception suggests that such binary patterns provide sufficient information for human speech recognition. Motivated by these findings, we propose to use ideal binary masks to improve phonetic modeling. We show that by combining the outputs of classifiers trained on traditional MFCC features and on this novel speech pattern, statistically significant improvements over the baseline MFCC-based classifier can be achieved for the task of phonetic classification. With the combined classifiers, we achieve an error rate of 19.5% on the TIMIT phonetic classification task, using multilayer perceptrons as the underlying classifier.
Keywords :
speech recognition; MFCC; TIMIT phonetic classification; binary masks; binary patterns; human speech recognition; phonetic classification; Error analysis; Mel frequency cepstral coefficient; Signal to noise ratio; Speech; Training; CASA; TIMIT; ideal binary mask; phone classification
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947532
Filename :
5947532