Regional Information Center for Science and Technology - Improved ROI and within frame discriminant features for lipreading

DocumentCode :

1672044

Title :

Improved ROI and within frame discriminant features for lipreading

Author :

Potamianos, Gerasimos ; Neti, Chalapathy

Author_Institution :

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

Volume :

fYear :

2001

fDate :

2001

Firstpage :

250

Abstract :

We study three aspects of designing appearance-based visual features for automatic lipreading: (a) the choice of the video region of interest (ROI) on which image transform features are obtained; (b) the extraction of speech discriminant features at each frame; (c) the use of temporal information to improve visual speech modeling. With respect to (a), we propose an ROI that includes the speaker's jaw and cheeks, in addition to the traditionally used mouth/lip region. With respect to (b) and (c), we propose the use of a two-stage linear discriminant analysis, both within a single frame and across a large number of frames. On a large-vocabulary, continuous-speech, audio-visual database, the proposed visual features result in a 13% absolute reduction in visual-only word error rate over a baseline visual front end, and in an additional 28% relative improvement in audio-visual over audio-only phonetic classification accuracy.
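
The two-stage linear discriminant analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no dimensions, class labels, or context width, so the function names (`lda_projection`, `stack_frames`, `two_stage_lda`), the parameters `d_intra`, `J`, and `d_inter`, the edge padding, and the small ridge term are all assumptions made for illustration. The per-frame input vectors stand in for image transform coefficients computed on the ROI, and the labels stand in for per-frame speech classes.

```python
import numpy as np


def lda_projection(X, y, out_dim):
    """Return a (dim, out_dim) LDA matrix for features X (n, dim) and labels y (n,)."""
    dim = X.shape[1]
    mean = X.mean(axis=0)
    Sw = np.zeros((dim, dim))  # within-class scatter
    Sb = np.zeros((dim, dim))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mean)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Assumption: a small ridge keeps Sw positive definite for the Cholesky step.
    Sw += 1e-6 * np.eye(dim)
    # Solve Sb w = lambda Sw w by whitening with the Cholesky factor of Sw.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))
    _, V = np.linalg.eigh(L_inv @ Sb @ L_inv.T)  # eigenvalues in ascending order
    return L_inv.T @ V[:, ::-1][:, :out_dim]     # keep the most discriminant directions


def stack_frames(F, J):
    """Concatenate each frame with its J previous and J next frames (edge-padded)."""
    T = F.shape[0]
    padded = np.pad(F, ((J, J), (0, 0)), mode="edge")
    return np.hstack([padded[k:k + T] for k in range(2 * J + 1)])


def two_stage_lda(frames, labels, d_intra, J, d_inter):
    """Stage 1: LDA within each frame. Stage 2: LDA across 2J+1 stacked frames."""
    intra = frames @ lda_projection(frames, labels, d_intra)
    stacked = stack_frames(intra, J)
    return stacked @ lda_projection(stacked, labels, d_inter)


# Synthetic stand-in data: 400 frames of 24 coefficients, 4 hypothetical classes.
rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=400)
frames = rng.normal(size=(400, 24)) + 0.5 * labels[:, None]
features = two_stage_lda(frames, labels, d_intra=6, J=2, d_inter=3)
```

The first projection reduces each frame on its own, so the second stage only has to handle `d_intra * (2 * J + 1)` dimensions when it models temporal context, which keeps the scatter matrices small even for wide context windows.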

Keywords :

discrete cosine transforms; feature extraction; image recognition; image sequences; speech recognition; audio-visual database; automatic lipreading; automatic speech recognition; continuous speech database; discrete cosine transform; discriminant features; large vocabulary database; linear discriminant analysis; speech discriminant feature extraction; temporal information; video region of interest; visual speech modeling; Algorithm design and analysis; Automatic speech recognition; Discrete cosine transforms; Discrete wavelet transforms; Feature extraction; Linear discriminant analysis; Mouth; Shape; Speech recognition; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Image Processing, 2001. Proceedings. 2001 International Conference on

Conference_Location :

Thessaloniki

Print_ISBN :

0-7803-6725-1

Type :

conf

DOI :

10.1109/ICIP.2001.958098

Filename :

958098

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1672044