Title :
An effective fusion scheme of spatio-temporal features for human action recognition in RGB-D video
Author :
Tran, Quang D. ; Ly, Ngoc Q.
Author_Institution :
Fac. of Inf. Technol., Univ. of Sci., Ho Chi Minh City, Vietnam
Abstract :
We investigate human action recognition by studying the effects of fusing feature streams extracted from color and depth sequences. Our contribution is two-fold. First, we present the 3DS-HONV descriptor, a spatio-temporal extension of the Histogram of Oriented Normal Vectors (HONV) designed to capture joint shape-motion cues from depth sequences. Second, we develop an effective RGB-D feature fusion scheme that exploits information from both color and depth channels to extract expressive representations for action recognition in real scenarios. Despite its simplicity, the 3DS-HONV descriptor performs surprisingly well, achieving state-of-the-art performance on the MSRAction3D dataset with an overall accuracy of 88.89%. Further experiments demonstrate that the feature fusion scheme also generalizes well and achieves good results on the one-shot-learning ChaLearn Gesture Data (CGD2011).
Keywords :
feature extraction; gesture recognition; image colour analysis; image representation; image sequences; spatiotemporal phenomena; 3DS-HONV descriptor; CGD2011; ChaLearn Gesture Data; MSRAction3D dataset; RGB-D video; color channels; color sequences; depth channels; depth sequences; expressive representations; feature streams; fusion scheme; human action recognition; oriented normal vector histogram; shape-motion vision; spatio-temporal extension; spatio-temporal features; Encoding; Feature extraction; Histograms; Joints; Quantization (signal); Three-dimensional displays; Vectors;
Conference_Title :
Control, Automation and Information Sciences (ICCAIS), 2013 International Conference on
Conference_Location :
Nha Trang
Print_ISBN :
978-1-4799-0569-0
DOI :
10.1109/ICCAIS.2013.6720562