Regional Information Center for Science and Technology - Robust detection of visual ROI for automatic speechreading

DocumentCode :

1833357

Title :

Robust detection of visual ROI for automatic speechreading

Author :

Iyengar, G. ; Potamianos, G. ; Neti, C. ; Faruquie, T. ; Verma, A.

Author_Institution :

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

fYear :

2001

fDate :

2001

Firstpage :

Lastpage :

Abstract :

We present our work on visual pruning in an audio-visual (AV) speech recognition scenario. Visual speech information has been successfully used in circumstances where audio-only recognition suffers (e.g., noisy environments). Tracking and extraction of the region-of-interest (ROI) (e.g., the speaker's mouth region) from video is an essential component of such systems. It is important for the visual front-end to handle tracking errors that result in noisy visual data and hamper performance. We present our robust visual front-end and investigate methods to prune visual noise and their effect on the performance of AV speech recognition systems. Specifically, we estimate the "goodness of ROI" using Gaussian mixture models, and our experiments indicate that significant performance gains are achieved with good quality visual data.
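
The "goodness of ROI" idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no model details, so the diagonal-covariance form, the two-dimensional features, the mixture parameters, the threshold, and the names `gmm_loglik`, `prune_frames`, and `GOOD_ROI_GMM` are all assumptions chosen for illustration. Each frame's ROI feature vector is scored by its log-likelihood under a Gaussian mixture model taken to represent well-tracked ROIs, and frames scoring below the threshold are flagged as visual noise.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of x under a GMM, using log-sum-exp for stability."""
    terms = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def prune_frames(features, gmm, threshold):
    """Return one flag per frame: True when the ROI score passes the threshold."""
    weights, means, variances = gmm
    return [gmm_loglik(f, weights, means, variances) >= threshold for f in features]


# Hypothetical 2-component model over 2-D ROI features (made-up numbers,
# standing in for a model trained on correctly tracked mouth regions).
GOOD_ROI_GMM = (
    [0.6, 0.4],                    # mixture weights
    [[0.0, 0.0], [1.0, 1.0]],      # component means
    [[1.0, 1.0], [0.5, 0.5]],      # per-dimension variances
)

# Two plausible frames and one outlier, as a tracking failure might produce.
frames = [[0.1, -0.2], [0.9, 1.1], [8.0, -7.0]]
flags = prune_frames(frames, GOOD_ROI_GMM, threshold=-10.0)
print(flags)  # [True, True, False]
```

In a real system the mixture would be estimated from labeled ROI data and the threshold tuned on held-out material; frames flagged `False` could then be dropped or down-weighted before AV fusion.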

Keywords :

Gaussian processes; audio-visual systems; feature extraction; image sequences; noise; speech recognition; tracking; video signal processing; AV speech recognition systems; Gaussian mixture models; audio-only recognition; audio-visual speech recognition; automatic recognition; automatic speechreading; noisy environments; noisy visual data; region-of-interest extraction; region-of-interest tracking; robust detection; tracking errors; video sequence; visual ROI; visual front-end; visual noise pruning; visual speech information; Automatic speech recognition; Detectors; Face detection; Facial features; Linear discriminant analysis; Lips; Mouth; Noise robustness; Speech recognition; Working environment noise;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Multimedia Signal Processing, 2001 IEEE Fourth Workshop on

Conference_Location :

Cannes

Print_ISBN :

0-7803-7025-2

Type :

conf

DOI :

10.1109/MMSP.2001.962715

Filename :

962715

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1833357