Implementation and comparison of three architectures for gesture recognition

Author

Corradini, Andrea ; Gross, Horst-Michael

Author_Institution

Dept. of Neuroinf., Tech. Hochschule Ilmenau, Germany

Volume

6

fYear

2000

fDate

2000

Firstpage

2361

Abstract

Several systems for automatic gesture recognition have been developed using different strategies and approaches. In these systems the recognition engine is mainly based on one of three techniques: dynamic pattern matching, statistical classification, or neural networks (NN). This paper presents and compares three architectures for the recognition of dynamic gestures that use these techniques or a hybrid combination of them. For all architectures, a common preprocessor receives a sequence of color images as input and produces a sequence of feature vectors of continuous parameters as output. The first two systems are hybrid architectures that combine neural networks and hidden Markov models (HMM). NNs classify single feature vectors, while HMMs model sequences of them, with the aim of exploiting the properties of both tools. More precisely, in the first system a Kohonen feature map (SOM) clusters the input space. Each code-book is then transformed into a symbol from a discrete alphabet and fed into a discrete HMM for classification. In the second approach, a radial basis function (RBF) network is used directly to compute the HMM state observation probabilities. In the last system, only dynamic programming techniques are employed: an input sequence of feature vectors is matched against predefined templates using the dynamic time warping (DTW) algorithm. Preliminary experiments with our baseline systems achieved a recognition accuracy of up to 92%. All systems use input from a monocular color video camera and are user-independent, but so far they are not real-time.
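
The template-matching idea behind the third architecture can be illustrated with a minimal sketch of dynamic time warping. It is not the authors' implementation: the Euclidean local distance, the unconstrained step pattern, the nearest-template decision rule, and the names `dtw_distance` and `classify` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Local distance between two feature vectors (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq: Sequence[Vector], template: Sequence[Vector]) -> float:
    """Accumulated cost of the best monotonic alignment of seq and template."""
    n, m = len(seq), len(template)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning seq[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq[i - 1], template[j - 1])
            # Allowed steps: advance in seq, advance in template, or both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(
    seq: Sequence[Vector],
    templates: Dict[str, List[Sequence[Vector]]],
) -> Optional[str]:
    """Return the label of the template with the smallest DTW distance."""
    best_label, best_cost = None, float("inf")
    for label, examples in templates.items():
        for template in examples:
            c = dtw_distance(seq, template)
            if c < best_cost:
                best_label, best_cost = label, c
    return best_label


# Hypothetical one-dimensional feature sequences, for illustration only.
templates = {
    "wave": [[[0.0], [1.0], [0.0]]],
    "point": [[[0.0], [1.0], [2.0]]],
}
query = [[0.0], [0.0], [1.0], [2.0]]  # a slower version of "point"
label = classify(query, templates)
```

Because the warping path may repeat frames of either sequence, a gesture performed more slowly or more quickly than its template can still align with low cost, which is the property that makes DTW suitable for sequences of varying length.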

Keywords

dynamic programming; feature extraction; gesture recognition; hidden Markov models; image classification; image colour analysis; image sequences; pattern matching; radial basis function networks; self-organising feature maps; video signal processing; HMM state observation probabilities; Kohonen feature map; automatic gesture recognition; code-book; color images sequence; continuous parameters; discrete HMM; discrete alphabet; dynamic gestures; dynamic pattern matching; dynamic programming; dynamic time warping algorithm; experiments; feature vectors; hidden Markov models; hybrid architectures; input sequence; monocular color video camera; neural networks; preprocessor; radial basis function network; recognition accuracy; recognition engine; statistical classification; Color; Computer networks; Data preprocessing; Dynamic programming; Engines; Heuristic algorithms; Hidden Markov models; Neural networks; Pattern matching; Pattern recognition;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on

Conference_Location

Istanbul

ISSN

1520-6149

Print_ISBN

0-7803-6293-4

Type

conf

DOI

10.1109/ICASSP.2000.859315

Filename

859315