• DocumentCode
    3329169
  • Title
    MKPLS: Manifold Kernel Partial Least Squares for Lipreading and Speaker Identification
  • Author
    Bakry, Assem ; Elgammal, Ahmed
  • Author_Institution
    Comput. Sci. Dept., Rutgers Univ., Piscataway, NJ, USA
  • fYear
    2013
  • fDate
    23-28 June 2013
  • Firstpage
    684
  • Lastpage
    691
  • Abstract
    Visual speech recognition is a challenging problem because visual speech features are easily confused with one another. The speaker identification problem is usually coupled with speech recognition. Moreover, speaker identification is important to several applications, such as automatic access control, biometrics, authentication, and personal privacy. In this paper, we propose a novel approach for lipreading and speaker identification. The approach parameterizes manifolds in a low-dimensional latent space, where each manifold is represented as a point in that space. We first parameterize each instance manifold using a nonlinear mapping from a unified manifold representation. We then factorize the parameter space using Kernel Partial Least Squares (KPLS) to obtain a low-dimensional manifold latent space. We use two-way projections to obtain two manifold latent spaces, one for the speech content and one for the speaker. We apply our approach to two public databases: AVLetters and OuluVS. We report results for three lipreading settings: speaker independent, speaker dependent, and speaker semi-dependent. Our approach outperforms the baseline by at least 15% in the speaker semi-dependent setting and is competitive in the other two settings.
  • Keywords
    audio-visual systems; feature extraction; least squares approximations; speaker recognition; AVLetters; MKPLS; OuluVS; latent space; lipreading identification; manifold kernel partial least square; manifold latent space; manifold parameterization; nonlinear mapping; parameter space factorization; speaker identification; unified manifold representation; visual speech feature; visual speech recognition; Databases; Hidden Markov models; Kernel; Manifolds; Speech; Speech recognition; Visualization; AVLetters; KPLS; LBP; Lipreading; Low-dimensional embedding; MKPLS; Manifold Parameterization; OuluVs; PLS; Speaker identification; Visual speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on
  • Conference_Location
    Portland, OR
  • ISSN
    1063-6919
  • Type
    conf
  • DOI
    10.1109/CVPR.2013.94
  • Filename
    6618938