Regional Information Center for Science and Technology - Combined speech decoders output for phoneme recognition enhancement

DocumentCode :

3455411

Title :

Combined speech decoders output for phoneme recognition enhancement

Author :

Abida, Kacem ; Karray, Fakhri ; Abida, Wafa

Author_Institution :

Electr. & Comput. Eng., Univ. of Waterloo, Waterloo, ON, Canada

fYear :

2009

fDate :

6-8 Nov. 2009

Firstpage :

Lastpage :

Abstract :

Phoneme recognition is an essential component of any robust speech decoder and has been tackled by many researchers. Speech feature extraction constitutes the front end module of any speech decoder: it plays an essential role and has a strong impact on the recognition performance. The research community is aggressively searching for more powerful solutions which combine the existing feature extraction methods for a better and more reliable information capture from the analog speech signal. In this research work, we propose new approaches to combining phoneme recognizers' output in order to provide better recognition performance and improved robustness with respect to noise and channel distortions. Machine learning tools such as the naive Bayes classifier, decision trees, and support vector machines have been used in the combination of hypotheses. Experiments under different SNR levels have proven that our proposed approach outperforms the two most common feature extraction techniques, namely mel frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP) with cepstral mean subtraction (CMS) and RASTA, respectively, for channel normalization.

Keywords :

Bayes methods; audio signal processing; decision trees; distortion; feature extraction; learning (artificial intelligence); speech coding; speech enhancement; speech recognition; support vector machines; analog speech signal; cepstral mean subtraction; channel distortion; channel normalization; decision trees; machine learning; mel frequency cepstral coefficients; naive Bayes classifier; noise distortion; perceptual linear prediction; phoneme recognition enhancement; robust speech decoder; speech feature extraction; support vector machines; Classification tree analysis; Decision trees; Decoding; Feature extraction; Machine learning; Mel frequency cepstral coefficient; Noise robustness; Speech enhancement; Speech recognition; Support vector machines; Phoneme recognition; SNR; decoder combination; feature extraction;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signals, Circuits and Systems (SCS), 2009 3rd International Conference on

Conference_Location :

Medenine

Print_ISBN :

978-1-4244-4397-0

Electronic_ISBN :

978-1-4244-4398-7

Type :

conf

DOI :

10.1109/ICSCS.2009.5412292

Filename :

5412292

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3455411