DocumentCode
3420407
Title
Unsupervised Random Forest Manifold Alignment for Lipreading
Author
Yuru Pei ; Tae-Kyun Kim ; Hongbin Zha
Author_Institution
Peking Univ., Beijing, China
fYear
2013
fDate
1-8 Dec. 2013
Firstpage
129
Lastpage
136
Abstract
Lip reading from visual channels remains challenging given the wide variation in speaking characteristics. In this paper, we present an efficient lip reading approach based on unsupervised random forest manifold alignment (RFMA). A density random forest is employed to estimate the affinity of patch trajectories in videos of speaking faces. We propose novel node-splitting criteria that avoid rank deficiency when learning density forests. By virtue of the hierarchical structure of random forests, the trajectory affinities are measured efficiently and are then used to find embeddings of the speaking video clips with a graph-based algorithm. Lip reading is formulated as matching between the manifolds of query and reference video clips. We employ a manifold alignment technique for the matching, in which an L∞-norm-based manifold-to-manifold distance is proposed to find the matching pairs. We apply this random forest manifold alignment technique to various video data sets captured by consumer cameras. The experiments demonstrate that lip reading can be performed effectively and that the approach outperforms state-of-the-art methods.
Keywords
learning (artificial intelligence); video signal processing; L∞-norm-based manifold-to-manifold distance; RFMA; consumer cameras; density random forest; graph-based algorithm; hierarchical structure; learning density forests; lipreading approach; matching pairs; patch trajectories; query manifolds; random forest manifold alignment technique; rank-deficiency; reference video clips; speaking characteristics; speaking facial videos; speaking video clips; unsupervised random forest manifold alignment; visual channels; Covariance matrices; Image color analysis; Manifolds; Shape; Trajectory; Vegetation; Videos; Lipreading; Manifold Alignment; RFMA; Unsupervised Random Forest;
fLanguage
English
Publisher
ieee
Conference_Titel
Computer Vision (ICCV), 2013 IEEE International Conference on
Conference_Location
Sydney, NSW
ISSN
1550-5499
Type
conf
DOI
10.1109/ICCV.2013.23
Filename
6751125