DocumentCode
3280694
Title
Scale based features for audiovisual speech recognition
Author
Matthews, I.A. ; Bangham, J.A. ; Cox, S.J.
Author_Institution
Sch. of Inf. Syst., East Anglia Univ., Norwich, UK
fYear
1996
fDate
35397
Firstpage
42583
Lastpage
42589
Abstract
This paper demonstrates the use of nonlinear image decomposition, in the form of a sieve, applied to the task of audiovisual speech recognition on a database of the letters A-Z for ten talkers. A scale based feature vector is formed directly from the grayscale pixels of an image containing the talker's mouth on a per-frame basis. This vector is independent of image amplitude and position information, and neither accurate tracking nor special markers are required. Results are presented for audio-only, visual-only, and early and late integrated audiovisual cases.
Keywords
audio-visual systems; audiovisual speech recognition; database; feature vector; grayscale pixels; image amplitude; nonlinear image decomposition; scale based features; sieve; tracking;
fLanguage
English
Publisher
iet
Conference_Titel
Integrated Audio-Visual Processing for Recognition, Synthesis and Communication (Digest No: 1996/213), IEE Colloquium on
Conference_Location
London
Type
conf
DOI
10.1049/ic:19961152
Filename
645684