Title :
Prosody based co-analysis for continuous recognition of coverbal gestures
Author :
Kettebekov, Sanshzar ; Yeasin, Mohammed ; Sharma, Rajeev
Author_Institution :
Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
Abstract :
Although recognition of natural speech and gestures has been studied extensively, previous attempts at combining them in a unified framework to boost classification were mostly semantically motivated, e.g., keyword-gesture co-occurrence. Such formulations inherit the complexity of natural language processing. This paper presents a Bayesian formulation that uses a phenomenon of gesture and speech articulation to improve the accuracy of automatic recognition of continuous coverbal gestures. The prosodic features from the speech signal were co-analyzed with the visual signal to learn the prior probability of co-occurrence of the prominent spoken segments with the particular kinematical phases of gestures. It was found that this co-analysis helps in detecting and disambiguating small hand movements, which subsequently improves the rate of continuous gesture recognition. The efficacy of the proposed approach was demonstrated on a large database collected from the Weather Channel broadcast. This formulation opens new avenues for bottom-up frameworks of multimodal integration.
Keywords :
gesture recognition; speech recognition; speech-based user interfaces; Bayesian formulation; classification; continuous coverbal gesture recognition; kinematical phases; large database; multimodal integration; prior co-occurrence probability; prominent spoken segments; prosodic features; prosody based co-analysis; small hand movement detection; small hand movement disambiguation; speech articulation; speech signal; visual signal; weather channel broadcast; Automatic speech recognition; Bayesian methods; Broadcasting; Computer science; Databases; Human computer interaction; Laboratories; Natural language processing; Natural languages
Conference_Title :
Multimodal Interfaces, 2002. Proceedings. Fourth IEEE International Conference on
Print_ISBN :
0-7695-1834-6
DOI :
10.1109/ICMI.2002.1166986