• DocumentCode
    3672211
  • Title

    Matching-CNN meets KNN: Quasi-parametric human parsing

  • Author

    Si Liu;Xiaodan Liang;Luoqi Liu;Xiaohui Shen;Jianchao Yang;Changsheng Xu;Liang Lin; Xiaochun Cao;Shuicheng Yan

  • Author_Institution
    SKLOIS, IIE, Chinese Academy of Sciences, China
  • fYear
    2015
  • fDate
    6/1/2015 12:00:00 AM
  • Firstpage
    1419
  • Lastpage
    1427
  • Abstract
    Both parametric and non-parametric approaches have demonstrated encouraging performances in the human parsing task, namely segmenting a human image into several semantic regions (e.g., hat, bag, left arm, face). In this work, we aim to develop a new solution with the advantages of both methodologies, namely supervision from annotated data and the flexibility to use newly annotated (possibly uncommon) images, and present a quasi-parametric human parsing model. Under the classic K Nearest Neighbor (KNN)-based nonparametric framework, the parametric Matching Convolutional Neural Network (M-CNN) is proposed to predict the matching confidence and displacements of the best matched region in the testing image for a particular semantic region in one KNN image. Given a testing image, we first retrieve its KNN images from the annotated/manually-parsed human image corpus. Then each semantic region in each KNN image is matched with confidence to the testing image using M-CNN, and the matched regions from all KNN images are further fused, followed by a superpixel smoothing procedure to obtain the ultimate human parsing result. The M-CNN differs from the classic CNN [12] in that the tailored cross image matching filters are introduced to characterize the matching between the testing image and the semantic region of a KNN image. The cross image matching filters are defined at different convolutional layers, each aiming to capture a particular range of displacements. Comprehensive evaluations over a large dataset with 7,700 annotated human images well demonstrate the significant performance gain from the quasi-parametric model over the state-of-the-arts [29, 30], for the human parsing task.
  • Keywords
    "Semantics","Testing","Impedance matching","Feature extraction","Neural networks","Pipelines"
  • Publisher
    ieee
  • Conference_Titel
    Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
  • Electronic_ISBN
    1063-6919
  • Type

    conf

  • DOI
    10.1109/CVPR.2015.7298748
  • Filename
    7298748