• DocumentCode
    2597568
  • Title

    Learning hierarchical representation with sparsity for RGB-D object recognition

  • Author

    Yu, Kuan-Ting ; Tseng, Shih-Huan ; Fu, Li-Chen

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan
  • fYear
    2012
  • fDate
    7-12 Oct. 2012
  • Firstpage
    3011
  • Lastpage
    3016
  • Abstract
    RGB-D sensors have gained popularity in object recognition research for their low cost and their ability to provide synchronized RGB and depth images. Researchers have therefore proposed new methods to extract features from RGB-D data. Meanwhile, learning-based feature representation is a promising approach for 2D image classification: by exploiting sparsity in 2D image signals, we can learn an image representation instead of using hand-crafted local descriptors such as SIFT or HoG. This framework inspired us to learn features from RGB-D data. Our work has two goals. First, we propose a novel Hierarchical Sparse Shape Descriptor (HSSD) to form a learning-based representation for 3D shapes. To achieve this, we analyze several 3D feature extraction techniques and propose a unified view of them. Then, we learn a hierarchical shape representation with sparse coding, max pooling, and local grouping. Second, we investigate whether RGB and depth information should be fused at a lower level or a higher level. Experimental results show that, first, our HSSD algorithm can learn a shape dictionary and provide shape cues in addition to the 2D cues. The proposed HSSD algorithm achieves 84% accuracy on a household RGB-D object dataset and outperforms the widely used VFH shape feature by 13%. Second, fusing RGB-D information at a lower level does not improve recognition performance.
  • Keywords
    feature extraction; image classification; image coding; image representation; learning (artificial intelligence); object recognition; 2D image classification; 3D feature extraction; HSSD; HoG; RGB-D sensor; SIFT; hand-crafted local descriptors; hierarchical representation; hierarchical shape representation; hierarchical sparse shape descriptor; learning-based feature representation; object recognition; sparse coding; Accuracy; Dictionaries; Encoding; Feature extraction; Filter banks; Histograms; Shape;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on
  • Conference_Location
    Vilamoura
  • ISSN
    2153-0858
  • Print_ISBN
    978-1-4673-1737-5
  • Type

    conf

  • DOI
    10.1109/IROS.2012.6386175
  • Filename
    6386175