Title :
Hierarchical hybrid statistic based video binary code and its application to face retrieval in TV-series
Author :
Yan Li ; Ruiping Wang ; Shiguang Shan ; Xilin Chen
Author_Institution :
Key Lab. of Intell. Inf. Process. of Chinese Acad. of Sci. (CAS), Inst. of Comput. Technol., Beijing, China
Abstract :
We address the problem of video face retrieval in TV-Series, which searches for video clips in which a particular character appears, given one video clip of that character. This is tremendously challenging because, on the one hand, faces in TV-Series are captured under largely uncontrolled conditions with complex appearance variations, and, on the other hand, the retrieval task typically requires a highly efficient representation with low time and space complexity. To handle these problems, we propose a compact and discriminative binary representation for the huge body of video data based on a novel hierarchical hybrid statistic. Our method, named Hierarchical Hybrid Statistic based Video Binary Code (HHSVBC), first utilizes differently parameterized Fisher Vectors (FVs) as frame representations that encode multi-granularity low-level variation information within each frame, and then models the video by its frame covariance matrix to capture high-level variation information among video frames. To incorporate discriminative information and obtain a more compact video signature, the high-dimensional video representation is further encoded into a much lower-dimensional binary vector, which finally yields the proposed HHSVBC. Specifically, each bit of the code is produced via supervised learning in a max-margin framework, which aims to make a trade-off between code discriminability and stability. Face retrieval experiments on two challenging large-scale TV-Series video databases demonstrate the competitiveness of the proposed HHSVBC against state-of-the-art retrieval methods.
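The pipeline outlined in the abstract (per-frame features, a covariance matrix over the frames, then a short binary code compared by Hamming distance) can be illustrated with a minimal sketch. This is not the authors' implementation: random vectors stand in for the Fisher Vectors, a random projection `W` stands in for the hyperplanes that the paper learns in a max-margin framework, and all function names and dimensions are hypothetical.

```python
import numpy as np


def video_covariance(frame_features):
    """Covariance of per-frame feature vectors; input shape (n_frames, d)."""
    X = np.asarray(frame_features, dtype=float)
    return np.cov(X, rowvar=False)  # shape (d, d)


def vectorize_upper(C):
    """Flatten the upper triangle (including the diagonal) of a symmetric matrix."""
    rows, cols = np.triu_indices(C.shape[0])
    return C[rows, cols]


def binary_code(v, W, b):
    """One bit per hyperplane: 1 if the projection is positive, else 0."""
    return (W @ v + b > 0).astype(np.uint8)


def hamming(a, b):
    """Number of differing bits between two codes."""
    return int(np.count_nonzero(a != b))


# Assumption: 30 frames with 8-dimensional features standing in for Fisher Vectors.
rng = np.random.default_rng(0)
frames = rng.normal(size=(30, 8))

C = video_covariance(frames)
v = vectorize_upper(C)  # 8 * 9 / 2 = 36 values

# Assumption: random hyperplanes in place of the supervised, max-margin ones.
W = rng.normal(size=(16, v.size))
b = np.zeros(16)
code = binary_code(v, W, b)

print(code.shape, hamming(code, code))
```

With real data, retrieval would rank database videos by the Hamming distance between their codes and the query's code, which is what makes a binary signature cheap in both time and space.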
Keywords :
computational complexity; face recognition; image representation; video coding; video databases; video retrieval; FV; HHSVBC; TV-series; compact video signature; complex appearance variations; discriminative binary representation; frame covariance matrix; frame representation; hierarchical hybrid statistic based video binary code; high-dimensional video representation; large scale TV-series video databases; max margin framework; multigranularity low level variation information; parameterized Fisher vectors; space complexity; time complexity; uncontrolled conditions; video clips; video data; video face retrieval; Binary codes; Covariance matrices; Databases; Face; Kernel; Optimization; Training;
Conference_Title :
Automatic Face and Gesture Recognition (FG), 2015 11th IEEE International Conference and Workshops on
Conference_Location :
Ljubljana
DOI :
10.1109/FG.2015.7163089