• DocumentCode
    248175
  • Title
    Resolution limits on visual speech recognition
  • Author
    Bear, Helen L.; Harvey, Richard; Theobald, Barry-John; Lan, Yuxuan
  • Author_Institution
    Sch. of Comput. Sci., Univ. of East Anglia, Norwich, UK
  • fYear
    2014
  • fDate
    27-30 Oct. 2014
  • Firstpage
    1371
  • Lastpage
    1375
  • Abstract
    Visual-only speech recognition is dependent upon a number of factors that can be difficult to control, such as lighting, identity, motion, emotion and expression. But some factors, such as video resolution, are controllable, so it is surprising that there is not yet a systematic study of the effect of resolution on lip-reading. Here we use a new data set, the Rosetta Raven data, to train and test recognizers so we can measure the effect of video resolution on recognition accuracy. We conclude that, contrary to common practice, resolution need not be that great for automatic lip-reading. However, it is highly unlikely that automatic lip-reading can work reliably when the distance between the bottom of the lower lip and the top of the upper lip is less than four pixels at rest.
  • Keywords
    image recognition; image resolution; speech recognition; video signal processing; Rosetta Raven data; automatic lip-reading; lip reading; resolution limits; video resolution; visual speech recognition; Accuracy; Active appearance model; Face; Hidden Markov models; Lips; Shape; Visualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2014 IEEE International Conference on
  • Conference_Location
    Paris
  • Type
    conf
  • DOI
    10.1109/ICIP.2014.7025274
  • Filename
    7025274