Regional Information Center for Science and Technology - Audio-visual isolated digit recognition for whispered speech

DocumentCode :

695600

Title :

Audio-visual isolated digit recognition for whispered speech

Author :

Xing Fan ; Busso, Carlos ; Hansen, John H. L.

Author_Institution :

Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas, Richardson, TX, USA

fYear :

2011

fDate :

Aug. 29 2011-Sept. 2 2011

Firstpage :

1500

Lastpage :

1503

Abstract :

Whisper is used by talkers intentionally in certain circumstances to protect personal privacy. Due to the absence of periodic excitation in the production of whisper, there are considerable differences between neutral and whispered speech in the spectral structure. Therefore, the performance of speech recognition systems trained with high-energy voiced phonemes degrades significantly when tested with whisper. In this study, we investigate the use of multi-stream models in isolated digit recognition of whispered speech. A small digit corpus with one subject speaking both whispered and neutral speech is collected. The eigenlips approach is used to extract visual features describing the appearance of the lips. MFCCs are employed as the feature set for speech. Two HMM systems are trained independently, one for each stream, and their scores are linearly combined. The resulting word accuracy shows a significant improvement (37%, absolute). The study represents one of the first advancements in whisper recognition using audio-visual features. It also supports the use of multi-stream HMMs to improve performance in whisper/neutral speech conditions.
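
The linear score combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the stream weights, the kind of score used, or any normalization, so everything here is an assumption made for illustration: the function names `fuse_scores` and `recognize`, the `audio_weight` value of 0.7, the use of per-word log-likelihoods, and the example numbers are all hypothetical and are not taken from the paper.

```python
def fuse_scores(audio_loglik, visual_loglik, audio_weight=0.7):
    """Linearly combine per-word scores from two independently trained systems.

    audio_loglik / visual_loglik: one score per candidate word, in the same order.
    audio_weight: hypothetical weight in [0, 1]; the visual stream gets 1 - audio_weight.
    """
    if len(audio_loglik) != len(visual_loglik):
        raise ValueError("both streams must score the same set of words")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    visual_weight = 1.0 - audio_weight
    return [audio_weight * a + visual_weight * v
            for a, v in zip(audio_loglik, visual_loglik)]


def recognize(audio_loglik, visual_loglik, labels, audio_weight=0.7):
    """Return the label whose fused score is highest."""
    fused = fuse_scores(audio_loglik, visual_loglik, audio_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Made-up scores for three candidate digits (not data from the study).
labels = ["zero", "one", "two"]
audio_scores = [-10.0, -12.0, -11.0]   # audio stream alone prefers "zero"
visual_scores = [-30.0, -20.0, -25.0]  # visual stream alone prefers "one"
decision = recognize(audio_scores, visual_scores, labels)
```

In this toy case the audio stream alone would pick "zero", while the fused score picks "one", which is the kind of correction a visual stream might provide when the acoustic evidence is weakened by whispering. In practice the weight would typically be tuned on held-out data.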

Keywords :

audio-visual systems; cepstral analysis; feature extraction; hidden Markov models; speech recognition; HMM system; MFCC; audio-visual isolated digit recognition; eigenlip approach; energy voiced phonemes; multistream model; visual feature extraction; whisper recognition; whispered speech recognition system; Accuracy; Feature extraction; Hidden Markov models; Speech; Speech recognition; Vectors; Visualization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal Processing Conference, 2011 19th European

Conference_Location :

Barcelona

ISSN :

2076-1465

Type :

conf

Filename :

7073972

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=695600