• DocumentCode
    2267279
  • Title
    An extended multiresolution approach to mouth specific AAM fitting for Speech Recognition
  • Author
    Berry, Craig ; Kokaram, Anil ; Harte, Naomi
  • Author_Institution
    Dept. of Electron. & Electr. Eng., Trinity Coll., Dublin, Ireland
  • fYear
    2011
  • fDate
    Aug. 29 2011-Sept. 2 2011
  • Firstpage
    1959
  • Lastpage
    1963
  • Abstract
    Active Appearance Models (AAMs) are a widely used technique for face tracking. They work by minimising the difference between an unobserved image and a synthetically generated image created by a statistical model of the deformable object, e.g. a face. The Fixed Jacobian algorithm is the most widely used algorithm for fitting AAMs. A Gaussian Image Pyramid fitting technique is used to make this process more robust and computationally faster. This paper presents a new image pyramid fitting structure, developed specifically for an Audio-Visual Speech Recognition (AVSR) system, in which the area described by the AAM is reduced as the iterations progress. This allows the fit to become more accurate as the fitting progresses. The new fitting structure is implemented with the Fixed Jacobian algorithm and compared to a standard approach in which the mouth shape is extracted from a full face AAM. The test is performed using images from the CUAVE database. The new structure is shown to be more accurate and robust than the full face approach, with a 14.10% increase in the convergence of the mouth points to within a 4-pixel average difference, while also achieving an 8.53% improvement in the accuracy of the fit.
  • Keywords
    Gaussian processes; curve fitting; iterative methods; speech recognition; statistical analysis; AAM; AVSR; Audio-Visual Speech Recognition system; CUAVE database; Gaussian image pyramid fitting technique; active appearance models; deformable object; extended multiresolution approach; face tracking; fixed Jacobian algorithm; iterations progress; speech recognition; statistical model; Active appearance model; Face; Image resolution; Jacobian matrices; Mouth; Principal component analysis; Shape;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference, 2011 19th European
  • Conference_Location
    Barcelona
  • ISSN
    2076-1465
  • Type
    conf
  • Filename
    7074009