  • DocumentCode
    59673
  • Title
    Semantic Pyramids for Gender and Action Recognition
  • Author
    Khan, Fahad Shahbaz ; van de Weijer, Joost ; Anwer, Rao Muhammad ; Felsberg, Michael ; Gatta, Carlo
  • Author_Institution
    Dept. of Electr. Eng., Linkoping Univ., Linkoping, Sweden
  • Volume
    23
  • Issue
    8
  • fYear
    2014
  • fDate
    Aug. 2014
  • Firstpage
    3633
  • Lastpage
    3645
  • Abstract
    Person description is a challenging problem in computer vision. We investigate two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as the face or the full body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for the upper body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets, namely: 1) Human Attribute; 2) Head-Shoulder; and 3) Proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.
  • Keywords
    computer vision; face recognition; feature extraction; gesture recognition; image classification; image representation; action recognition; face detectors; gender recognition; person description; pyramid representation; semantic information extraction; semantic pyramid approach; Detectors; Face; Image recognition; Semantics; bag-of-words
  • fLanguage
    English
  • Journal_Title
    Image Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1057-7149
  • Type
    jour
  • DOI
    10.1109/TIP.2014.2331759
  • Filename
    6838985
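
The pipeline outlined in the abstract (pick one box per body-part detector, extract features from each region, combine them into a single representation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the box-selection rule or the fusion scheme, so the sketch assumes the highest-scoring detection is kept and that per-part features are L2-normalized and concatenated. All names here (`select_best_box`, `combine_parts`, `toy_feature`) are hypothetical, and `toy_feature` is a stand-in for a real image descriptor.

```python
import numpy as np

# Order of the three regions named in the abstract.
PARTS = ("full_body", "upper_body", "face")


def select_best_box(candidates):
    """Pick one box from a list of (box, score) pairs.

    Assumption: the best candidate is the one with the highest detector
    score. Returns None when the detector produced no candidates.
    """
    if not candidates:
        return None
    best_box, _ = max(candidates, key=lambda c: c[1])
    return best_box


def crop(image, box):
    """Crop a 2-D image array to box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1]


def toy_feature(region):
    """Hypothetical stand-in descriptor: (mean, std) of pixel values."""
    return np.array([region.mean(), region.std()], dtype=float)


def combine_parts(image, detections, extract=toy_feature, dim=2):
    """Concatenate one L2-normalized feature block per body part.

    Assumption: a part with no detection contributes a zero block, so the
    combined descriptor always has length len(PARTS) * dim.
    """
    blocks = []
    for part in PARTS:
        box = select_best_box(detections.get(part, []))
        if box is None:
            feat = np.zeros(dim)
        else:
            feat = np.asarray(extract(crop(image, box)), dtype=float)
            norm = np.linalg.norm(feat)
            if norm > 0:
                feat = feat / norm
        blocks.append(feat)
    return np.concatenate(blocks)


if __name__ == "__main__":
    image = np.ones((10, 10))
    detections = {
        "full_body": [((0, 0, 10, 10), 1.0)],
        "upper_body": [((0, 0, 5, 5), 0.2), ((0, 0, 10, 5), 0.9)],
        "face": [],  # no face detected in this toy example
    }
    print(combine_parts(image, detections))
```

The combined vector would then be passed to any standard classifier; keeping a fixed-length block per part is one simple way to handle images in which a detector returns nothing.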