• DocumentCode
    3003147
  • Title
    Automated extraction of signs from continuous sign language sentences using Iterated Conditional Modes
  • Author
    Nayak, Sunita ; Sarkar, Sudeep ; Loeding, Barbara
  • Author_Institution
    Photometria Inc., San Diego, CA, USA
  • fYear
    2009
  • fDate
    20-25 June 2009
  • Firstpage
    2583
  • Lastpage
    2590
  • Abstract
    Recognition of signs in sentences requires a training set constructed out of signs found in continuous sentences. Currently, this is done manually, which is a tedious process. In this work, we consider a framework where the modeler just provides multiple video sequences of sign language sentences, constructed to contain the vocabulary of interest. We learn the models of the recurring signs automatically. Specifically, we automatically extract the parts of the signs that are present in most occurrences of the sign in context. These parts of the signs, which are stable with respect to adjacent signs, are referred to as signemes. Each video is first transformed into a multidimensional time series representation, capturing the motion and shape aspects of the sign. We then extract signemes from multiple sentences concurrently, using Iterated Conditional Modes (ICM). We show results by learning multiple instances of 10 different signs from a set of 136 sign language sentences. We classify each extracted signeme as correct, partially correct, or incorrect, depending on whether both the start and end locations are correct, only one of them is correct, or both are incorrect, respectively. Out of the 136 extracted video signemes, 98 were correct, 20 were partially correct, and 18 were incorrect. To demonstrate the generality of the unsupervised modeling idea, we also show the ability to automatically extract common spoken words in audio. We consider the spoken English glosses corresponding to the sign language sentences and extract the audio counterparts of the signs. Of the 136 such instances, we recovered 127 correct, 8 partially correct, and 1 incorrect representation of the words.
  • Keywords
    feature extraction; image classification; image sequences; modelling; video signal processing; automated sign extraction; continuous sign language sentence; iterated conditional mode; multidimensional time series representation; sign recognition; signemes classification; spoken words; training set; unsupervised; video sequences; video signemes extraction; word representation; Computer science education; Continuing education; Handicapped aids; Lakes; Motion measurement; Multidimensional systems; Photometry; Shape; Video sequences; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on
  • Conference_Location
    Miami, FL
  • ISSN
    1063-6919
  • Print_ISBN
    978-1-4244-3992-8
  • Type
    conf
  • DOI
    10.1109/CVPR.2009.5206599
  • Filename
    5206599
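
The three-way evaluation rule stated in the abstract (correct when both the start and end locations are right, partially correct when only one is, incorrect when neither is) can be sketched as below. This is a minimal illustration, not the authors' evaluation code: the function names, the (start, end) frame-index representation, and the `tol` tolerance parameter are all assumptions, since the abstract does not say how a boundary is judged to be correct.

```python
from collections import Counter

def label_extraction(pred, truth, tol=0):
    """Label one extracted segment against its reference segment.

    pred and truth are (start, end) frame indices. tol is a hypothetical
    tolerance in frames; the abstract does not specify one.
    """
    start_ok = abs(pred[0] - truth[0]) <= tol
    end_ok = abs(pred[1] - truth[1]) <= tol
    if start_ok and end_ok:
        return "correct"
    if start_ok or end_ok:
        return "partially correct"
    return "incorrect"

def tally(preds, truths, tol=0):
    """Count the labels over paired lists of extracted and reference segments."""
    return Counter(label_extraction(p, t, tol) for p, t in zip(preds, truths))
```

Applied to all 136 extracted segments, a tally of this kind would yield the three counts reported in the abstract, which sum to 136 for both the video (98 + 20 + 18) and the audio (127 + 8 + 1) experiments.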