• DocumentCode
    729771
  • Title
    Beyond Bag-of-Words: Fast video classification with Fisher Kernel Vector of Locally Aggregated Descriptors
  • Author
    Mironica, Ionut ; Duta, Ionut ; Ionescu, Bogdan ; Sebe, Nicu
  • Author_Institution
    LAPI, University Politehnica of Bucharest, Bucharest, Romania
  • fYear
    2015
  • fDate
    June 29 2015-July 3 2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    In this paper, we introduce a new video description framework that replaces the traditional Bag-of-Words with a combination of Fisher Kernels (FK) and Vector of Locally Aggregated Descriptors (VLAD). The main contributions are: (i) a fast algorithm that densely extracts global frame features, which are easier and faster to compute than spatio-temporal local features; (ii) replacement of the traditional k-means based vocabulary with a Random Forest approach that allows a significant speedup; (iii) use of a modified VLAD and FK representation in place of the classic Bag-of-Words, which yields better performance. We show that our framework is highly general and does not depend on a particular type of descriptor. It achieves state-of-the-art results in several classification scenarios.
  • Keywords
    feature extraction; image classification; video signal processing; FK; Fisher kernel vector; VLAD; bag-of-words; classification scenarios; fast video classification; global frame feature extraction; k-means based vocabulary; locally aggregated descriptors; random forest approach; spatio-temporal local features; vector of locally aggregated descriptors; Accuracy; Kernel; Standards; Support vector machines; Training; Vegetation; Visualization; Fisher Kernel Vector of Locally Aggregated Descriptor; Random Forests; video classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo (ICME), 2015 IEEE International Conference on
  • Conference_Location
    Turin
  • Type
    conf
  • DOI
    10.1109/ICME.2015.7177489
  • Filename
    7177489