Regional Information Center for Science and Technology (RICEST) - SNR-Adaptive Stream Weighting for Audio-MES ASR

DocumentCode :

809414

Title :

SNR-Adaptive Stream Weighting for Audio-MES ASR

Author :

Lee, Ki-Seung

Author_Institution :

Dept. of Electron. Eng., Konkuk Univ., Seoul

Volume :

Issue :

fYear :

2008

Firstpage :

2001

Lastpage :

2010

Abstract :

Myoelectric signals (MESs) from the speaker's mouth region have been shown to improve the noise robustness of automatic speech recognizers (ASRs), which promises to extend their usability in noisy conditions. In the recognition system presented herein, extracted audio and facial MES features were integrated by a decision fusion method, where the likelihood score of the audio-MES observation vector was given by a linear combination of the class-conditional observation log-likelihoods of two classifiers, using appropriate weights. We developed a weighting process that adapts to the SNR. The main objective of the paper is to determine the optimal SNR classification boundaries and to construct a set of optimum stream weights for each SNR class. These two parameters were determined by a method based on a maximum mutual information criterion. Acoustic and facial MES data were collected from five subjects, using a 60-word vocabulary. Four types of acoustic noise (babble, car, aircraft, and white noise) were acoustically added to clean speech signals, with SNR ranging from -14 to 31 dB. The classification accuracy of the audio ASR was as low as 25.5%, whereas that of the MES ASR was 85.2%. The proposed audio-MES weighting method improved the classification accuracy further, to as high as 89.4% in the case of babble noise. Similar results were found for the other types of noise.
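
The decision fusion step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the SNR boundaries, the weight values, and the function names `snr_class`, `fuse_log_likelihoods`, and `classify` are hypothetical, and the constraint that the two stream weights sum to one is an assumption (the abstract only states that "appropriate weights" are used). In the paper the boundaries and weights are obtained with a maximum mutual information criterion, which is not reproduced here.

```python
import bisect

# Hypothetical SNR class boundaries in dB (illustrative values, not from the paper).
SNR_BOUNDARIES_DB = [0.0, 10.0, 20.0]

# Hypothetical audio-stream weight for each SNR class, lowest SNR class first.
# Assumption: the MES-stream weight is 1 minus the audio-stream weight.
AUDIO_WEIGHTS = [0.1, 0.3, 0.6, 0.9]


def snr_class(snr_db):
    """Return the index of the SNR class that contains snr_db."""
    return bisect.bisect_right(SNR_BOUNDARIES_DB, snr_db)


def fuse_log_likelihoods(audio_ll, mes_ll, snr_db):
    """Linearly combine per-class log-likelihoods from the two classifiers."""
    w_audio = AUDIO_WEIGHTS[snr_class(snr_db)]
    w_mes = 1.0 - w_audio
    return [w_audio * a + w_mes * m for a, m in zip(audio_ll, mes_ll)]


def classify(audio_ll, mes_ll, snr_db):
    """Pick the class (e.g. a word index) with the highest fused score."""
    fused = fuse_log_likelihoods(audio_ll, mes_ll, snr_db)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up log-likelihoods for a three-word vocabulary.
audio_scores = [-10.0, -5.0, -20.0]
mes_scores = [-4.0, -12.0, -15.0]
low_snr_choice = classify(audio_scores, mes_scores, snr_db=-5.0)
high_snr_choice = classify(audio_scores, mes_scores, snr_db=25.0)
```

With these made-up numbers the low-SNR decision follows the MES classifier and the high-SNR decision follows the audio classifier, which is the behavior SNR-adaptive weighting is meant to produce.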

Keywords :

electromyography; medical signal processing; speech recognition; SNR-adaptive stream weighting; acoustic noise; audio MES feature; audio-MES ASR; automatic speech recognizers; babble noise; decision fusion method; facial MES feature; myoelectric signals; Automatic speech recognition; Face recognition; Mouth; Mutual information; Noise robustness; Signal to noise ratio; Speech enhancement; Streaming media; Usability; Vectors; Automatic speech recognition; decision fusion; maximum mutual information (MMI) criterion; myoelectric signals (MESs); optimal weighting; Algorithms; Artificial Intelligence; Electromyography; Humans; Pattern Recognition, Automated; Speech Production Measurement; Speech Recognition Software

fLanguage :

English

Journal_Title :

Biomedical Engineering, IEEE Transactions on

Publisher :

IEEE

ISSN :

0018-9294

Type :

jour

DOI :

10.1109/TBME.2008.921094

Filename :

4567620

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=809414