Author_Institution :
Dept. of Electr., Comput. & Syst. Eng., Rensselaer Polytech. Inst., Troy, NY, USA
Abstract :
Tracking facial activities from video is an important and challenging problem. Existing computer vision techniques characterize facial activities at three levels, from local to global. At the bottom level, facial feature tracking detects and tracks the prominent landmarks surrounding facial components (e.g., mouth, eyebrows). At the middle level, facial action units (AUs) characterize the specific behaviors of these local facial components (e.g., mouth open, eyebrow raiser). At the top level, facial expression, which represents the subject's emotion (e.g., surprise, happiness, anger), controls the global muscular movement of the whole face. Most existing methods focus on one or two of these levels and track (or recognize) them separately. In this paper, we propose to exploit the relationships among the multi-level facial activities and to track all three levels simultaneously. Specifically, we propose a unified stochastic framework based on a dynamic Bayesian network (DBN) that explicitly represents the evolution of facial activity at each level, the interactions among levels, and their observations. By modeling the relationships among the three levels of facial activity, the proposed method can improve tracking (or recognition) performance at all three levels.
Keywords :
belief networks; computer vision; emotion recognition; face recognition; feature extraction; image representation; tracking; video signal processing; computer vision technique; dynamic Bayesian network; emotion representation; facial action unit; facial component; landmarks tracking; multilevel facial activity; muscular movement; simultaneous facial activity tracking; unified stochastic framework; Face; Face recognition; Facial features; Gold; Mouth; Shape; Tracking;
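
Illustrative sketch :
The abstract describes a DBN that links expression, action units, and landmark measurements but does not give the model itself. The toy Python filter below only illustrates the general idea of inferring several levels jointly. It is not the authors' model: the two-state variables, the discretized "mouth gap" measurement, every probability value, and all names (`track`, `filter_step`, `P_AU_GIVEN_E`, and so on) are assumptions made up for this example.

```python
from itertools import product

EXPRESSIONS = ("neutral", "surprise")      # top level (hypothetical states)
AUS = ("mouth_closed", "mouth_open")       # middle level (hypothetical states)

# All probabilities below are invented for illustration only.
P_E0 = {"neutral": 0.8, "surprise": 0.2}                 # initial expression prior
P_E_TRANS = {                                            # P(E_t | E_{t-1})
    "neutral": {"neutral": 0.9, "surprise": 0.1},
    "surprise": {"neutral": 0.3, "surprise": 0.7},
}
P_AU_GIVEN_E = {                                         # P(AU_t | E_t)
    "neutral": {"mouth_closed": 0.9, "mouth_open": 0.1},
    "surprise": {"mouth_closed": 0.2, "mouth_open": 0.8},
}
P_OBS_GIVEN_AU = {                                       # P(measurement_t | AU_t)
    "mouth_closed": {"small_gap": 0.85, "large_gap": 0.15},
    "mouth_open": {"small_gap": 0.10, "large_gap": 0.90},
}


def normalize(dist):
    """Scale a dict of non-negative weights so that it sums to 1."""
    total = sum(dist.values())
    return {key: value / total for key, value in dist.items()}


def filter_step(prior_e, obs):
    """Joint posterior over (E_t, AU_t) given the predicted expression prior."""
    joint = {}
    for e, a in product(EXPRESSIONS, AUS):
        joint[(e, a)] = prior_e[e] * P_AU_GIVEN_E[e][a] * P_OBS_GIVEN_AU[a][obs]
    return normalize(joint)


def marginal(joint, axis):
    """Sum the joint posterior down to one variable (0 = expression, 1 = AU)."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def predict(post_e):
    """Propagate the expression posterior one frame ahead."""
    return {
        e_next: sum(post_e[e_prev] * P_E_TRANS[e_prev][e_next] for e_prev in EXPRESSIONS)
        for e_next in EXPRESSIONS
    }


def track(observations):
    """Forward filtering: per-frame posteriors for expression and AU."""
    prior_e = dict(P_E0)
    history = []
    for obs in observations:
        joint = filter_step(prior_e, obs)
        post_e, post_au = marginal(joint, 0), marginal(joint, 1)
        history.append((post_e, post_au))
        prior_e = predict(post_e)
    return history


if __name__ == "__main__":
    for post_e, post_au in track(["small_gap", "large_gap", "large_gap"]):
        print(post_e, post_au)
```

In this sketch a single measurement updates the action-unit and expression beliefs together, and the expression belief carried over from earlier frames in turn biases the action-unit estimate. That mutual support between levels is the kind of relationship the abstract proposes to exploit.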