Regional Information Center for Science and Technology - Independent information from visual features for multimodal speech recognition

DocumentCode :

3102268

Title :

Independent information from visual features for multimodal speech recognition

Author :

Gurbuz, Sabri ; Tufekci, Zekeriya ; Patterson, Eric ; Gowdy, John N.

Author_Institution :

Dept. of Electr. & Comput. Eng., Clemson Univ., SC, USA

fYear :

2001

fDate :

2001

Firstpage :

221

Lastpage :

228

Abstract :

The performance of audio-based speech recognition systems degrades severely when there is a mismatch between training and usage environments due to background noise. This degradation is due to a loss of ability to extract and distinguish important information from audio features. One of the emerging techniques for dealing with this problem is the addition of visual features in a multimodal recognition system. This paper presents an affine-invariant, multimodal speech recognition system and focuses on the additional information that is available from video features. Results are presented that demonstrate the distinct information available from a visual subsystem that will allow optimal joint-decisions based on the SNR-ratio and type of noise to exceed either audio or video subsystem in nearly all noisy environments.

Keywords :

acoustic noise; feature extraction; image recognition; speech recognition; video signal processing; SNR-ratio; affine-invariant multimodal speech recognition system; audio features; audio subsystem; audio-based speech recognition systems; background noise; multimodal recognition system; multimodal speech recognition; optimal joint-decisions; video features; visual features; visual subsystem; Acoustic noise; Automatic speech recognition; Background noise; Degradation; Feature extraction; Humans; Speech enhancement; Speech recognition; System performance; Working environment noise

fLanguage :

English

Publisher :

IEEE

Conference_Title :

SoutheastCon 2001. Proceedings. IEEE

Conference_Location :

Clemson, SC

Print_ISBN :

0-7803-6748-0

Type :

conf

DOI :

10.1109/SECON.2001.923119

Filename :

923119

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3102268