• DocumentCode
    724497
  • Title

    One-shot learning gesture recognition based on improved 3D SMoSIFT feature descriptor from RGB-D videos

  • Author

    Jia Lin ; Xiaogang Ruan ; Naigong Yu ; Ruoyan Wei

  • Author_Institution
    Electron. Inf. & Control Eng. Coll., Beijing Univ. of Technol., Beijing, China
  • fYear
    2015
  • fDate
    23-25 May 2015
  • Firstpage
    4911
  • Lastpage
    4916
  • Abstract
    To satisfy the distinctive feature extraction requirement of one-shot learning gesture recognition for mobile robot control, an improved three-dimensional local sparse motion scale invariant feature transform (3D SMoSIFT) feature descriptor is proposed, which fuses RGB-D videos. First, gray, depth, and optical flow pyramids are built as the scale space for each gray frame (converted from the RGB frame) and each depth frame. Interest regions are then extracted according to the variance of optical flow, which is calculated in the horizontal and vertical directions. Subsequently, corners are extracted only within each interest region as interest points, and the gray and depth optical flow information is used simultaneously to detect robust keypoints around the motion pattern in the scale space. Finally, SIFT descriptors are calculated in 3D gradient space and 3D motion space. The improved feature descriptor has been evaluated under a bag-of-features model on the one-shot learning Chalearn Gesture Dataset. Experiments demonstrate that the proposed method distinctly improves the accuracy of gesture recognition. The results also show that the improved 3D SMoSIFT feature descriptor surpasses other spatiotemporal feature descriptors and is comparable to state-of-the-art approaches.
  • Keywords
    feature extraction; gesture recognition; image colour analysis; image fusion; image motion analysis; image sequences; learning (artificial intelligence); mobile robots; robot vision; video signal processing; 3D SMoSIFT feature descriptor; 3D gradient space; 3D motion space; RGB-D video fusion; RGB-D videos; SIFT descriptors; corner extraction; depth pyramid; feature extraction; gray pyramid; interest region extraction; mobile robot control; motion pattern; one-shot learning Chalearn gesture dataset; one-shot learning gesture recognition; optical flow pyramid; robust keypoint detection; three-dimensional local sparse motion scale invariant feature transform; Conferences; Gesture Recognition; One-shot Learning; RGB-D Data; Three dimensional Sparse Motion Scale-invariant Feature Transform (3D SMoSIFT); Variance of Optical Flow;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control and Decision Conference (CCDC), 2015 27th Chinese
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-1-4799-7016-2
  • Type

    conf

  • DOI
    10.1109/CCDC.2015.7162803
  • Filename
    7162803