Title :
Object tracking with Bayesian estimation of dynamic layer representations
Author :
Tao, Hai ; Sawhney, Harpreet S. ; Kumar, Rakesh
Author_Institution :
Dept. of Comput. Eng., California Univ., Santa Cruz, CA, USA
fDate :
1/1/2002
Abstract :
Decomposing video frames into coherent 2D motion layers is a powerful method for representing videos. Such a representation provides an intermediate description that enables applications such as object tracking, video summarization and visualization, video insertion, and sprite-based video compression. Previous work on motion layer analysis has largely concentrated on two-frame or multi-frame batch formulations. The temporal coherency of motion layers and the domain constraints on shapes have not been exploited. This paper introduces a complete dynamic motion layer representation in which spatial and temporal constraints on shape, motion and layer appearance are modeled and estimated in a maximum a posteriori (MAP) framework using the generalized expectation-maximization (EM) algorithm. To limit the computational complexity of tracking arbitrarily shaped layer ownership, we propose a shape prior that parameterizes the representation of shape and prevents motion layers from evolving into arbitrary shapes. In this work, a Gaussian shape prior is chosen to specifically develop a near-real-time tracker for vehicle tracking in aerial videos. However, the general idea of using a parametric shape representation as part of the state of a tracker is a powerful one that can be extended to other domains as well. Based on the dynamic layer representation, an iterative algorithm is developed for continuous object tracking over time. The proposed method has been successfully applied in an airborne vehicle tracking system. Its performance is compared with that of a correlation-based tracker and a motion change-based tracker to demonstrate the advantages of the new method. Examples of tracking when the backgrounds are cluttered and the vehicles undergo various rigid motions and complex interactions such as passing, turning, and stop-and-go demonstrate the strength of the complete dynamic layer representation.
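To make the idea of a parametric shape prior concrete, the following is a minimal sketch of how a Gaussian shape prior could be combined with an appearance likelihood to give soft per-pixel layer ownership. It is not the authors' implementation. The function names, the single-foreground-layer setup, the `floor` constant, and the `bg_prior` value are all assumptions made for illustration, and the appearance likelihoods are taken as given inputs rather than computed from image data.

```python
import math


def gaussian_shape_prior(x, y, center, cov, floor=0.01):
    """Unnormalized prior that pixel (x, y) belongs to a foreground layer.

    center: (cx, cy) layer center.
    cov: ((sxx, sxy), (sxy, syy)), a 2x2 covariance describing layer shape.
    floor: small constant (an assumption of this sketch) so that pixels far
        from the center keep a nonzero prior.
    """
    dx, dy = x - center[0], y - center[1]
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return floor + math.exp(-0.5 * d2)


def ownership(x, y, center, cov, lik_fg, lik_bg, bg_prior=0.5):
    """Posterior probability that the pixel belongs to the foreground layer.

    lik_fg, lik_bg: appearance likelihoods of the pixel under the foreground
        and background layers (hypothetical inputs).
    bg_prior: constant background prior (an assumed value).
    """
    fg = gaussian_shape_prior(x, y, center, cov) * lik_fg
    bg = bg_prior * lik_bg
    return fg / (fg + bg)
```

With equal appearance likelihoods, a pixel at the layer center is assigned mostly to the foreground, while a pixel far from the center falls almost entirely to the background. Because the shape is described by only a center and a covariance, it can be carried as part of the tracker state instead of an arbitrary per-pixel mask.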
Keywords :
Bayes methods; active vision; computational complexity; image representation; iterative methods; maximum likelihood estimation; motion estimation; optimisation; real-time systems; surveillance; tracking; vehicles; video signal processing; Bayesian estimation; Gaussian shape prior; aerial video surveillance; aerial videos; airborne vehicle tracking system; arbitrarily shaped layer ownership; cluttered backgrounds; coherent 2D motion layers; complex interactions; continuous object tracking; correlation-based tracker; dynamic layer representations; generalized expectation-maximization algorithm; intermediate description; iterative algorithm; layer appearance; maximum a-posteriori estimation; motion change-based tracker; motion layer analysis; multi-frame batch formulations; near-real-time tracker; parameterization; parametric shape representation; performance; rigid motions; shape domain constraints; spatial constraints; sprite-based video compression; temporal coherency; temporal constraints; vehicle passing; vehicle stop-and-go; vehicle turning; video frame decomposition; video insertion; video representation; video summarization; video visualization; Bayesian methods; Maximum a posteriori estimation; Motion analysis; Motion estimation; Shape; Tracking; Vehicle dynamics; Vehicles; Video compression; Visualization
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on