• DocumentCode
    29385
  • Title
    Incremental Learning of 3D-DCT Compact Representations for Robust Visual Tracking
  • Author
    Xi Li; Anthony Dick; Chunhua Shen; A. van den Hengel; Hanzi Wang
  • Author_Institution
    Australian Centre for Visual Technologies, University of Adelaide, Adelaide, SA, Australia
  • Volume
    35
  • Issue
    4
  • fYear
    2013
  • fDate
    April 2013
  • Firstpage
    863
  • Lastpage
    881
  • Abstract
    Visual tracking usually requires an object appearance model that is robust to changing illumination, pose, and other factors encountered in video. Many recent trackers utilize appearance samples in previous frames to form the bases upon which the object appearance model is built. This approach has the following limitations: 1) The bases are data driven, so they can be easily corrupted, and 2) it is difficult to robustly update the bases in challenging situations. In this paper, we construct an appearance model using the 3D discrete cosine transform (3D-DCT). The 3D-DCT is based on a set of cosine basis functions which are determined by the dimensions of the 3D signal and thus independent of the input video data. In addition, the 3D-DCT can generate a compact energy spectrum whose high-frequency coefficients are sparse if the appearance samples are similar. By discarding these high-frequency coefficients, we simultaneously obtain a compact 3D-DCT-based object representation and a signal reconstruction-based similarity measure (reflecting the information loss from signal reconstruction). To efficiently update the object representation, we propose an incremental 3D-DCT algorithm which decomposes the 3D-DCT into successive operations of the 2D discrete cosine transform (2D-DCT) and 1D discrete cosine transform (1D-DCT) on the input video data. As a result, the incremental 3D-DCT algorithm only needs to compute the 2D-DCT for newly added frames as well as the 1D-DCT along the third dimension, which significantly reduces the computational complexity. Based on this incremental 3D-DCT algorithm, we design a discriminative criterion to evaluate the likelihood of a test sample belonging to the foreground object. We then embed the discriminative criterion into a particle filtering framework for object state inference over time. Experimental results demonstrate the effectiveness and robustness of the proposed tracker.
  • Keywords
    computational complexity; discrete cosine transforms; image representation; learning (artificial intelligence); object tracking; particle filtering (numerical methods); signal reconstruction; video signal processing; 1D discrete cosine transform; 1D-DCT; 2D discrete cosine transform; 2D-DCT; 3D discrete cosine transform; 3D signal; 3D-DCT compact representations; appearance samples; changing illumination; compact 3D-DCT-based object representation; compact energy spectrum; computational complexity; cosine basis functions; discriminative criterion; high-frequency coefficients; incremental 3D-DCT algorithm; incremental learning; information loss; input video data; object appearance model; object state inference; particle filtering framework; robust visual tracking; signal reconstruction-based similarity measure; Adaptation models; Algorithm design and analysis; Discrete cosine transforms; Image reconstruction; Loss measurement; Robustness; Visualization; Visual tracking; appearance model; compact representation; discrete cosine transform (DCT); incremental learning; template matching; Algorithms; Artificial Intelligence; Face; Humans; Imaging, Three-Dimensional; Models, Theoretical; Video Recording;
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2012.166
  • Filename
    6257395
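
The abstract describes decomposing the 3D-DCT into a 2D-DCT on each frame followed by a 1D-DCT along the third (temporal) dimension, so that only newly added frames need a fresh 2D-DCT. The sketch below illustrates that separability with NumPy only. It is not the authors' implementation: the names dct_matrix, dct2, dct3_full, and IncrementalDCT3 are hypothetical, the orthonormal DCT-II is an assumed choice of transform, and the sketch does not cover coefficient truncation, the similarity measure, or the particle filter.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (rows are cosine basis functions)."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row has its own normalisation
    return m


def dct2(frame):
    """2D-DCT of one frame: transform the rows, then the columns."""
    h, w = frame.shape
    return dct_matrix(h) @ frame @ dct_matrix(w).T


def dct3_full(volume):
    """Reference 3D-DCT of an (h, w, t) volume, computed in one shot."""
    h, w, t = volume.shape
    return np.einsum("ah,bw,ct,hwt->abc",
                     dct_matrix(h), dct_matrix(w), dct_matrix(t), volume)


class IncrementalDCT3:
    """Hypothetical helper: caches per-frame 2D-DCTs so that a new frame
    costs one 2D-DCT plus a 1D-DCT along the temporal axis."""

    def __init__(self):
        self.cached = []  # one 2D-DCT array per frame seen so far

    def add_frame(self, frame):
        # Only the new frame is transformed; earlier results are reused.
        self.cached.append(dct2(frame))

    def coefficients(self):
        # Stack cached 2D-DCTs to (h, w, t), then apply the 1D-DCT over t.
        stack = np.stack(self.cached, axis=2)
        return stack @ dct_matrix(stack.shape[2]).T


rng = np.random.default_rng(0)
volume = rng.standard_normal((4, 5, 6))  # toy stack of six 4 x 5 "frames"

inc = IncrementalDCT3()
for t in range(volume.shape[2]):
    inc.add_frame(volume[:, :, t])

same = np.allclose(inc.coefficients(), dct3_full(volume))
print("incremental result matches full 3D-DCT:", same)
```

Because the basis matrices depend only on the signal dimensions, they could be precomputed once per dimension; this sketch rebuilds them on each call for brevity.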