• DocumentCode
    38026
  • Title
    Real-Time Gesture Interface Based on Event-Driven Processing From Stereo Silicon Retinas
  • Author
    Lee, Jun Haeng; Delbruck, Tobi; Pfeiffer, Michael; Park, Paul K. J.; Shin, Chang-Woo; Ryu, Hyunsurk; Kang, Byung Chang
  • Author_Institution
    Samsung Adv. Inst. of Technol., Samsung Electron. Co. Ltd., Yongin, South Korea
  • Volume
    25
  • Issue
    12
  • fYear
    2014
  • fDate
    Dec. 2014
  • Firstpage
    2250
  • Lastpage
    2263
  • Abstract
    We propose a real-time hand gesture interface based on combining a stereo pair of biologically inspired event-based dynamic vision sensor (DVS) silicon retinas with neuromorphic event-driven postprocessing. Compared with conventional vision or 3-D sensors, the use of DVSs, which output asynchronous and sparse events in response to motion, eliminates the need to extract movements from sequences of video frames, and allows significantly faster and more energy-efficient processing. In addition, the rate of input events depends on the observed movements, and thus provides an additional cue for solving the gesture spotting problem, i.e., finding the onsets and offsets of gestures. We propose a postprocessing framework based on spiking neural networks that can process the events received from the DVSs in real time, and provides an architecture for future implementation in neuromorphic hardware devices. The motion trajectories of moving hands are detected by spatiotemporally correlating the stereoscopically verged asynchronous events from the DVSs by using leaky integrate-and-fire (LIF) neurons. Adaptive thresholds of the LIF neurons achieve the segmentation of trajectories, which are then translated into discrete and finite feature vectors. The feature vectors are classified with hidden Markov models, using a separate Gaussian mixture model for spotting irrelevant transition gestures. The disparity information from stereovision is used to adapt LIF neuron parameters to achieve recognition invariant of the distance of the user to the sensor, and also helps to filter out movements in the background of the user. Exploiting the high dynamic range of DVSs, furthermore, allows gesture recognition over a 60-dB range of scene illuminance. The system achieves recognition rates well over 90% under a variety of variable conditions with static and dynamic backgrounds with naïve users.
  • Keywords
    Gaussian processes; gesture recognition; hidden Markov models; human computer interaction; neural nets; stereo image processing; vectors; video signal processing; Gaussian mixture model; LIF neuron parameter; asynchronous event; biologically inspired event-based DVS; discrete vector; dynamic vision sensor; energy-efficient processing; event-driven processing; finite feature vector; gesture spotting problem; hidden Markov model; leaky integrate-and-fire neuron; motion trajectory; neuromorphic event-driven postprocessing; neuromorphic hardware device; real-time hand gesture interface; sparse event; spiking neural network; stereo pair; stereo silicon retina; stereovision; video frames; Correlation; Hidden Markov models; Neurons; Retina; Silicon; Trajectory; Voltage control; Gesture recognition; hidden Markov model (HMM); human-computer interface; neuromorphic; silicon retina
  • fLanguage
    English
  • Journal_Title
    Neural Networks and Learning Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2162-237X
  • Type
    jour
  • DOI
    10.1109/TNNLS.2014.2308551
  • Filename
    6774446
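
The abstract describes leaky integrate-and-fire (LIF) neurons that correlate asynchronous DVS events in time. The sketch below is only a generic, event-driven LIF neuron written to illustrate that idea; it is not the authors' network. The class name `LIFNeuron`, the helper `count_spikes`, and the values of `tau`, `threshold`, and `weight` are all hypothetical, and the adaptive thresholds and stereo disparity handling mentioned in the abstract are left out.

```python
import math


class LIFNeuron:
    """Generic event-driven leaky integrate-and-fire neuron (illustrative only)."""

    def __init__(self, tau=0.01, threshold=3.0, weight=1.0):
        self.tau = tau              # membrane time constant in seconds (assumed value)
        self.threshold = threshold  # fixed firing threshold (the paper adapts it; this sketch does not)
        self.weight = weight        # potential added per input event (assumed value)
        self.potential = 0.0
        self.last_t = None

    def on_event(self, t):
        """Update the neuron for an input event at time t; return True if it fires."""
        if self.last_t is not None:
            # Exponential leak over the time elapsed since the previous event.
            self.potential *= math.exp(-(t - self.last_t) / self.tau)
        self.last_t = t
        self.potential += self.weight
        if self.potential >= self.threshold:
            self.potential = 0.0    # reset after a spike
            return True
        return False


def count_spikes(timestamps, **kwargs):
    """Feed event timestamps (seconds, ascending) to a fresh neuron and count output spikes."""
    neuron = LIFNeuron(**kwargs)
    return sum(neuron.on_event(t) for t in timestamps)


dense = [0.000, 0.001, 0.002, 0.003]    # events 1 ms apart, as a moving edge might produce
sparse = [0.0, 0.1, 0.2, 0.3, 0.4]      # events 100 ms apart, as isolated noise might produce
print(count_spikes(dense), count_spikes(sparse))
```

With these assumed parameters, events arriving close together accumulate faster than the leak removes them and drive the neuron over threshold, while widely spaced events decay away before the next one arrives. This is the sense in which a LIF neuron acts as a temporal correlation detector.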