Automated extraction of signs from continuous sign language sentences using Iterated Conditional Modes

Author

Nayak, Sunita ; Sarkar, Sudeep ; Loeding, Barbara

Author_Institution

Photometria Inc., San Diego, CA, USA

fYear

2009

fDate

20-25 June 2009

Firstpage

2583

Lastpage

2590

Abstract

Recognition of signs in sentences requires a training set constructed from signs found in continuous sentences. Currently, this is done manually, which is a tedious process. In this work, we consider a framework in which the modeler provides only multiple video sequences of sign language sentences, constructed to contain the vocabulary of interest. We learn the models of the recurring signs automatically. Specifically, we automatically extract the parts of the signs that are present in most occurrences of the sign in context. These parts, which are stable with respect to adjacent signs, are referred to as signemes. Each video is first transformed into a multidimensional time series representation that captures the motion and shape aspects of the sign. We then extract signemes from multiple sentences concurrently using Iterated Conditional Modes (ICM). We show results by learning multiple instances of 10 different signs from a set of 136 sign language sentences. We classify each extracted signeme as correct, partially correct, or incorrect, depending on whether both the start and end locations are correct, only one of them is correct, or both are incorrect, respectively. Out of the 136 extracted video signemes, 98 were correct, 20 were partially correct, and 18 were incorrect. To demonstrate the generality of the unsupervised modeling idea, we also show the ability to automatically extract common spoken words in audio. We consider the spoken English glosses corresponding to the sign language sentences and extract the audio counterparts of the signs. Of the 136 such instances, we recovered 127 correct, 8 partially correct, and 1 incorrect representation of the words.
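
The three-way evaluation rule described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how a start or end location is judged correct, so the frame tolerance `tol`, the `(start, end)` span representation, and the function names `classify_signeme` and `tally` are assumptions made for illustration only, not the authors' evaluation code.

```python
from collections import Counter


def classify_signeme(pred, truth, tol=0):
    """Label an extracted (start, end) frame span against a reference span.

    `tol` is a hypothetical frame tolerance; the abstract does not state
    how a boundary is judged correct.
    """
    start_ok = abs(pred[0] - truth[0]) <= tol
    end_ok = abs(pred[1] - truth[1]) <= tol
    if start_ok and end_ok:
        return "correct"            # both boundaries match
    if start_ok or end_ok:
        return "partially correct"  # exactly one boundary matches
    return "incorrect"              # neither boundary matches


def tally(preds, truths, tol=0):
    """Count how many extracted spans fall into each of the three labels."""
    return Counter(classify_signeme(p, t, tol) for p, t in zip(preds, truths))
```

Applied to all extracted signemes, `tally` would yield counts of the kind reported in the abstract (for example 98, 20, and 18 for the video signemes), although the actual criterion used to decide boundary correctness may differ from this tolerance-based assumption.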

Keywords

feature extraction; image classification; image sequences; modelling; video signal processing; automated sign extraction; continuous sign language sentence; iterated conditional mode; multidimensional time series representation; sign recognition; signemes classification; spoken words; training set; unsupervised; video sequences; video signemes extraction; word representation; Computer science education; Continuing education; Handicapped aids; Lakes; Motion measurement; Multidimensional systems; Photometry; Shape; Video sequences; Vocabulary;

fLanguage

English

Publisher

ieee

Conference_Titel

Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on

Conference_Location

Miami, FL

ISSN

1063-6919

Print_ISBN

978-1-4244-3992-8

Type

conf

DOI

10.1109/CVPR.2009.5206599

Filename

5206599