• DocumentCode
    3280694
  • Title

    Scale based features for audiovisual speech recognition

  • Author

    Matthews, I.A. ; Bangham, J.A. ; Cox, S.J.

  • Author_Institution
    Sch. of Inf. Syst., East Anglia Univ., Norwich, UK
  • fYear
    1996
  • fDate
    28 Nov 1996
  • Firstpage
    8/1
  • Lastpage
    8/7
  • Abstract
    This paper demonstrates the use of nonlinear image decomposition, in the form of a sieve, applied to the task of audiovisual speech recognition on a database of the letters A-Z for ten talkers. A scale-based feature vector is formed directly from the grayscale pixels of an image containing the talker's mouth on a per-frame basis. The feature vector is independent of image amplitude and position information, and neither accurate tracking nor special markers are required. Results are presented for audio-only, visual-only, and early and late integrated audiovisual cases.
  • Keywords
    audio-visual systems; audiovisual speech recognition; database; feature vector; grayscale pixels; image amplitude; nonlinear image decomposition; scale based features; sieve; tracking;
  • fLanguage
    English
  • Publisher
    IET
  • Conference_Titel
    Integrated Audio-Visual Processing for Recognition, Synthesis and Communication (Digest No: 1996/213), IEE Colloquium on
  • Conference_Location
    London
  • Type

    conf

  • DOI
    10.1049/ic:19961152
  • Filename
    645684