  • DocumentCode
    42247
  • Title
    Coding Visual Features Extracted From Video Sequences
  • Author
    Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
  • Author_Institution
    Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
  • Volume
    23
  • Issue
    5
  • fYear
    2014
  • fDate
    May 2014
  • Firstpage
    2262
  • Lastpage
    2276
  • Abstract
    Visual features are successfully exploited in several applications (e.g., visual search, object recognition, and tracking) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that reduce the required bit budget while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
  • Keywords
    feature extraction; image representation; image retrieval; image sequences; object tracking; video coding; ATC; CTA paradigm; analyze then compress; bandwidth limited network; coding architecture; coding techniques; coding visual feature extraction; compress-then-analyze; content based retrieval; evaluation metrics; image content representation; interframe coding modes; object recognition; rate distortion optimization; remote nodes; video frame extraction; video sequences; visual analysis; visual features; visual search; visual sensor networks; Encoding; Image coding; Vectors; Visualization; SIFT; SURF; local descriptors
  • fLanguage
    English
  • Journal_Title
    Image Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1057-7149
  • Type
    jour
  • DOI
    10.1109/TIP.2014.2312617
  • Filename
    6775275