• DocumentCode
    3605451
  • Title
    Robust Face Recognition via Multimodal Deep Face Representation
  • Author
    Changxing Ding ; Dacheng Tao
  • Author_Institution
    Centre for Quantum Comput. & Intell. Syst., Univ. of Technol., Sydney, NSW, Australia
  • Volume
    17
  • Issue
    11
  • fYear
    2015
  • Firstpage
    2049
  • Lastpage
    2058
  • Abstract
    Face images appearing in multimedia applications, e.g., social networks and digital entertainment, usually exhibit dramatic pose, illumination, and expression variations, resulting in considerable performance degradation for traditional face recognition algorithms. This paper proposes a comprehensive deep learning framework to jointly learn face representation using multimodal information. The proposed deep learning structure is composed of a set of elaborately designed convolutional neural networks (CNNs) and a three-layer stacked auto-encoder (SAE). The set of CNNs extracts complementary facial features from multimodal data. The extracted features are then concatenated to form a high-dimensional feature vector, whose dimension is compressed by the SAE. All of the CNNs are trained using a subset of 9,000 subjects from the publicly available CASIA-WebFace database, which ensures the reproducibility of this work. Using the proposed single CNN architecture and limited training data, a 98.43% verification rate is achieved on the LFW database. Benefiting from the complementary information contained in multimodal data, our small ensemble system achieves a recognition rate higher than 99.0% on LFW using a publicly available training set.
  • Keywords
    face recognition; feature extraction; image representation; learning (artificial intelligence); neural nets; CASIA-WebFace database; CNN; LFW database; SAE; complementary facial feature extraction; comprehensive deep learning framework; convolutional neural networks; face images; face representation; high-dimensional feature vector; multimodal deep face representation; multimodal information; robust face recognition; three-layer stacked auto-encoder; Databases; Face; Face recognition; Feature extraction; Multimedia communication; Social network services; Training; Convolutional neural networks (CNNs); deep learning; face recognition; multimodal system;
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type
    jour
  • DOI
    10.1109/TMM.2015.2477042
  • Filename
    7243358
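
The fusion step described in the abstract (per-modality CNN features concatenated into one high-dimensional vector, then compressed by a three-layer SAE) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the feature dimensions (3 modalities of 8 values each, compressed 24 -> 16 -> 8 -> 4), the sigmoid activation, and the random untrained weights are all hypothetical, and the random vectors merely stand in for real CNN outputs. Training of the encoder is omitted.

```python
import math
import random


def concatenate_features(per_modality):
    """Join the feature vectors of one face (one vector per modality) end to end."""
    joint = []
    for vec in per_modality:
        joint.extend(vec)
    return joint


def init_encoder(layer_dims, seed=0):
    """Create random, untrained (weights, biases) pairs, one per encoder layer."""
    rng = random.Random(seed)
    layers = []
    for d_in, d_out in zip(layer_dims[:-1], layer_dims[1:]):
        weights = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(d_out)]
        biases = [0.0] * d_out
        layers.append((weights, biases))
    return layers


def encode(x, layers):
    """Pass a vector through each encoder layer (affine map followed by sigmoid)."""
    for weights, biases in layers:
        x = [
            1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
            for row, b in zip(weights, biases)
        ]
    return x


# Hypothetical stand-ins for the outputs of three CNNs on one face image.
rng = random.Random(1)
cnn_features = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(3)]

joint = concatenate_features(cnn_features)        # 3 * 8 = 24 values
layers = init_encoder([len(joint), 16, 8, 4])     # three encoder layers (assumed sizes)
compact = encode(joint, layers)                   # 4-value compressed representation

print(len(joint), len(compact))
```

In a real system the encoder weights would be learned rather than drawn at random, and the compressed vector would be the representation compared between two faces.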