Learning Motion Categories using both Semantic and Structural Information

Author

Wong, Shu-Fai ; Kim, Tae-Kyun ; Cipolla, Roberto

Author_Institution

Univ. of Cambridge, Cambridge

fYear

2007

fDate

17-22 June 2007

Firstpage

1

Lastpage

6

Abstract

Current approaches to motion category recognition typically focus on either full spatiotemporal volume analysis (the holistic approach) or analysis of the content of spatiotemporal interest points (the part-based approach). Holistic approaches tend to be more sensitive to noise, e.g. geometric variations, while part-based approaches usually ignore structural dependencies between parts. This paper presents a novel generative model that extends probabilistic latent semantic analysis (pLSA) to capture both semantic (content of parts) and structural (connection between parts) information for motion category recognition. The learnt structural information can also be used to infer the location of motion for motion detection. We test our algorithm on challenging datasets involving human actions, facial expressions and hand gestures, and show that it outperforms existing unsupervised methods in both motion localisation and recognition.

Keywords

image motion analysis; image recognition; learning (artificial intelligence); facial expressions; hand gestures; human actions; motion category recognition; motion detection; motion location; probabilistic latent semantic analysis; semantic information; structural dependencies; structural information; Humans; Information analysis; Motion analysis; Solid modeling; Spatiotemporal phenomena; Support vector machine classification; Support vector machines; Videos

fLanguage

English

Publisher

IEEE

Conference_Title

Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference on

Conference_Location

Minneapolis, MN

ISSN

1063-6919

Print_ISBN

1-4244-1179-3

Electronic_ISBN

1063-6919

Type

conf

DOI

10.1109/CVPR.2007.383332

Filename

4270330