• DocumentCode
    3468184
  • Title
    A Multi-scale Approach to Gesture Detection and Recognition
  • Author
    Neverova, Natalia ; Wolf, Christian ; Paci, Giacomo ; Sommavilla, Giacomo ; Taylor, Graham W. ; Nebout, Florian
  • Author_Institution
    LIRIS, Univ. de Lyon, Lyon, France
  • fYear
    2013
  • fDate
    2-8 Dec. 2013
  • Firstpage
    484
  • Lastpage
    491
  • Abstract
    We propose a generalized approach to human gesture recognition based on multiple data modalities such as depth video, articulated pose and speech. In our system, each gesture is decomposed into large-scale body motion and local subtle movements such as hand articulation. The idea of learning at multiple scales is also applied to the temporal dimension, such that a gesture is considered as a set of characteristic motion impulses, or dynamic poses. Each modality is first processed separately in short spatio-temporal blocks, where discriminative data-specific features are either manually extracted or learned. Finally, we employ a Recurrent Neural Network for modeling large-scale temporal dependencies, data fusion and ultimately gesture classification. Our experiments on the 2013 Challenge on Multimodal Gesture Recognition dataset have demonstrated that using multiple modalities at several spatial and temporal scales leads to a significant increase in performance, allowing the model to compensate for errors of individual classifiers as well as noise in the separate channels.
  • Keywords
    gesture recognition; image classification; image fusion; image motion analysis; recurrent neural nets; data fusion; gesture classification; human gesture detection; human gesture recognition; large-scale body motion; local subtle movements; multimodal gesture recognition dataset; multiple data modalities; multiscale approach; recurrent neural network; Context; Data models; Feature extraction; Gesture recognition; Hidden Markov models; Joints; Vectors; action recognition; convolutional neural networks; gesture recognition; multimodal systems; recurrent neural networks;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision Workshops (ICCVW), 2013 IEEE International Conference on
  • Conference_Location
    Sydney, NSW
  • Type
    conf
  • DOI
    10.1109/ICCVW.2013.69
  • Filename
    6755936