• DocumentCode
    3420407
  • Title
    Unsupervised Random Forest Manifold Alignment for Lipreading
  • Author
    Yuru Pei ; Tae-Kyun Kim ; Hongbin Zha
  • Author_Institution
    Peking Univ., Beijing, China
  • fYear
    2013
  • fDate
    1-8 Dec. 2013
  • Firstpage
    129
  • Lastpage
    136
  • Abstract
    Lip reading from visual channels remains challenging because speaking characteristics vary widely. In this paper, we present an efficient lip reading approach based on unsupervised random forest manifold alignment (RFMA). A density random forest is employed to estimate the affinity of patch trajectories in speaking facial videos. We propose novel node-splitting criteria to avoid rank deficiency when learning density forests. By virtue of the hierarchical structure of random forests, the trajectory affinities are measured efficiently and are then used to find embeddings of the speaking video clips with a graph-based algorithm. Lip reading is formulated as matching between the manifolds of query and reference video clips. We employ a manifold alignment technique for matching, in which an L-norm-based manifold-to-manifold distance is proposed to find the matching pairs. We apply this random forest manifold alignment technique to various video data sets captured by consumer cameras. The experiments demonstrate that lip reading can be performed effectively and that the approach outperforms state-of-the-art methods.
  • Keywords
    learning (artificial intelligence); video signal processing; L-norm-based manifold-to-manifold distance; RFMA; consumer cameras; density random forest; graph-based algorithm; hierarchical structure; learning density forests; lipreading approach; matching pairs; patch trajectories; query manifolds; random forest manifold alignment technique; rank-deficiency; reference video clips; speaking characteristics; speaking facial videos; speaking video clips; unsupervised random forest manifold alignment; visual channels; Covariance matrices; Image color analysis; Manifolds; Shape; Trajectory; Vegetation; Videos; Lipreading; Manifold Alignment; RFMA; Unsupervised Random Forest;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision (ICCV), 2013 IEEE International Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    1550-5499
  • Type
    conf
  • DOI
    10.1109/ICCV.2013.23
  • Filename
    6751125