DocumentCode :
3455411
Title :
Combined speech decoders output for phoneme recognition enhancement
Author :
Abida, Kacem ; Karray, Fakhri ; Abida, Wafa
Author_Institution :
Electr. & Comput. Eng., Univ. of Waterloo, Waterloo, ON, Canada
fYear :
2009
fDate :
6-8 Nov. 2009
Firstpage :
1
Lastpage :
6
Abstract :
Phoneme recognition is an essential component of any robust speech decoder and has been tackled by many researchers. Speech feature extraction constitutes the front-end module of any speech decoder: it plays an essential role and has a strong impact on recognition performance. The research community is actively searching for more powerful solutions that combine existing feature extraction methods to capture information from the analog speech signal more reliably. In this research work, we propose new approaches to combining phoneme recognizers' outputs in order to provide better recognition performance and improved robustness with respect to noise and channel distortions. Machine learning tools such as the naive Bayes classifier, decision trees, and support vector machines have been used to combine the hypotheses. Experiments under different SNR levels have shown that our proposed approach outperforms the two most common feature extraction techniques, namely mel frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), which use cepstral mean subtraction (CMS) and RASTA, respectively, for channel normalization.
Keywords :
Bayes methods; audio signal processing; decision trees; distortion; feature extraction; learning (artificial intelligence); speech coding; speech enhancement; speech recognition; support vector machines; analog speech signal; cepstral mean subtraction; channel distortion; channel normalization; decision trees; machine learning; mel frequency cepstral coefficients; naive Bayes classifier; noise distortion; perceptual linear prediction; phoneme recognition enhancement; robust speech decoder; speech feature extraction; support vector machines; Classification tree analysis; Decision trees; Decoding; Feature extraction; Machine learning; Mel frequency cepstral coefficient; Noise robustness; Speech enhancement; Speech recognition; Support vector machines; Phoneme recognition; SNR; decoder combination; feature extraction;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signals, Circuits and Systems (SCS), 2009 3rd International Conference on
Conference_Location :
Medenine
Print_ISBN :
978-1-4244-4397-0
Electronic_ISBN :
978-1-4244-4398-7
Type :
conf
DOI :
10.1109/ICSCS.2009.5412292
Filename :
5412292