  • DocumentCode
    19247
  • Title
    Learning Multimodal Latent Attributes
  • Author
    Yanwei Fu; Timothy M. Hospedales; Tao Xiang; Shaogang Gong
  • Author_Institution
    Sch. of Electron. Eng. & Comput. Sci., Queen Mary Univ. of London, London, UK
  • Volume
    36
  • Issue
    2
  • fYear
    2014
  • fDate
    Feb. 2014
  • Firstpage
    303
  • Lastpage
    316
  • Abstract
    The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity by transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular, we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multimodal content and their complex, unstructured nature relative to the density of annotations. To solve this problem, we 1) introduce the concept of a semilatent attribute space, expressing user-defined and latent attributes in a unified framework, and 2) propose a novel scalable probabilistic topic model for learning multimodal semilatent attributes, which dramatically reduces the need for an exhaustive, accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches on a variety of realistic multimedia sparse-data learning tasks, including multitask learning, learning with label noise, N-shot transfer learning, and, importantly, zero-shot learning.
  • Keywords
    learning (artificial intelligence); multimedia computing; object recognition; social networking (online); N-shot transfer learning; annotation techniques; automatic media classification; data sparsity; learning multimodal latent attributes; multimedia data; multimedia sparse data learning; multimodal content; multitask learning; semantic gap; social media sharing; unified framework; Data models; Feature extraction; Media; Noise; Ontologies; Semantics; Videos; Attribute learning; latent attribute space; transfer learning; zero-shot learning
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2013.128
  • Filename
    6552193