  • DocumentCode
    1377807
  • Title

    A Matrix-Based Approach to Unsupervised Human Action Categorization

  • Author

    Cui, Peng ; Wang, Fei ; Sun, Li-Feng ; Zhang, Jian-Wei ; Yang, Shi-Qiang

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • Volume
    14
  • Issue
    1
  • fYear
    2012
  • Firstpage
    102
  • Lastpage
    110
  • Abstract
    Human action, as the basic unit of most human-relevant video content, bridges the gap between low-level visual features and high-level semantics. Human action recognition is of great significance for applications such as human-computer interaction, intelligent video surveillance, and video retrieval and search. In this paper, we propose a novel unsupervised approach to mining categories from action video sequences. It consists of two modules: an action representation for video data structurization and a learning model for unsupervised categorization. In the action representation, a novel view of video decomposition is presented. Videos are regarded as spatially distributed dynamic pixel time series, and these dynamic pixels are first quantized into pixel prototypes. After the pixel time series are replaced with their corresponding prototype labels, the video sequences are compressed into two-dimensional action matrices. In the learning model, we stack these matrices to form a multi-action tensor and propose a joint matrix factorization method that simultaneously clusters the pixel prototypes into pixel signatures and the matrices into action classes, taking into account the duality between pixel clustering and action clustering. The approach is tested on the widely used public Weizmann and KTH datasets, and promising results are achieved.
  • Keywords
    category theory; data mining; feature extraction; human computer interaction; image motion analysis; image sequences; learning (artificial intelligence); matrix decomposition; pattern clustering; quantisation (signal); signal classification; signal representation; source separation; tensors; time series; video signal processing; 2D action matrix; action clustering; action representation; action video sequences; category mining; dynamic pixel quantization; high-level semantics; human action recognition; human-computer interaction; human-relevant video content; intelligent video surveillance; joint matrix factorization method; learning model; low-level visual features; matrix-based approach; multiaction tensor; pixel prototype clustering; pixel signature; spatially distributed dynamic pixel time series; unsupervised human action categorization; video data structurization; video decomposition; video retrieval; video search; video sequence compression; Discrete Fourier transforms; Feature extraction; Prototypes; Semantics; Tensile stress; Time series analysis; Video sequences; Action categorization; joint matrix factorization; tensor representation; video analysis;
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type

    jour

  • DOI
    10.1109/TMM.2011.2176110
  • Filename
    6082444
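
The action-representation step described in the abstract (pixel time series quantized into prototypes, then each video compressed into a two-dimensional matrix of prototype labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `quantize_pixels`, the use of plain k-means on raw intensity series, the per-video (rather than shared) prototypes, and all parameter values are assumptions made for illustration only. The joint matrix factorization stage is not covered here.

```python
import numpy as np

def quantize_pixels(video, n_prototypes=4, n_iter=20, seed=0):
    """Compress a (T, H, W) video into an (H, W) matrix of prototype labels.

    Hypothetical helper: each pixel location is treated as a length-T time
    series, and the series are clustered with a basic k-means (an assumed
    stand-in for whatever quantizer the paper actually uses).
    """
    T, H, W = video.shape
    # One row per pixel location, one column per frame.
    series = video.reshape(T, H * W).T.astype(float)

    rng = np.random.default_rng(seed)
    # Initialise prototypes from randomly chosen pixel time series.
    centers = series[rng.choice(len(series), n_prototypes, replace=False)]

    for _ in range(n_iter):
        # Squared Euclidean distance from every series to every prototype.
        dists = ((series[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each prototype to the mean of the series assigned to it.
        for k in range(n_prototypes):
            members = series[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)

    # Replace each pixel time series by its prototype label.
    return labels.reshape(H, W)


if __name__ == "__main__":
    # Synthetic stand-in for a short grayscale clip: 10 frames of 6x8 pixels.
    clip = np.random.default_rng(1).random((10, 6, 8))
    action_matrix = quantize_pixels(clip, n_prototypes=3)
    print(action_matrix.shape)
```

Stacking the label matrices of several clips of equal size, for example with `np.stack`, would give a three-dimensional array in the spirit of the multi-action tensor mentioned in the abstract.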