• DocumentCode
    254608
  • Title
    Ground-Based Activity Recognition at Distance and behind Wall
  • Author
    Tao Wang; Riad Hammoud; Zhigang Zhu
  • Author_Institution
    BAE Syst., Burlington, MA, USA
  • fYear
    2014
  • fDate
    23-28 June 2014
  • Firstpage
    231
  • Lastpage
    236
  • Abstract
    Long-range activity recognition is a challenging research problem in surveillance settings where sensors cannot be placed close to targets. Even a simple activity can be confused with other activities, or go unrecognized, if detection in one of the sensor modalities is uncertain or unavailable. Moreover, training on some real-life activities is not feasible, because it is hard to collect sufficient, accurately labeled data for the variety of free-living activities. In this paper, we use an unsupervised learning algorithm, the Dirichlet process Gaussian mixture model (DPGMM), to construct a model that determines the number of classes automatically. To represent a set of features as one event and to communicate between audio and video, we use the DPGMM as a base and enhance it with additional aggregation, multimodal association and transition. The new model is called the aggregation coupled Dirichlet process Gaussian mixture model (AC-DPGMM). We present experiments with activities that cannot be distinguished using visual features alone. With audio information, we can also recognize some activities that are invisible in video, such as speaking behind a wall. We compared our model with a generative clustering algorithm and the original DPGMM, and showed improvements in accuracy of 23.6% and 18.8% over them, measured against manually labeled data.
  • Keywords
    Gaussian processes; audio signal processing; image recognition; mixture models; unsupervised learning; video signal processing; AC-DPGMM; Dirichlet process Gaussian mixture model; aggregation coupled Dirichlet process Gaussian mixture model; audio; feature representation; generative clustering algorithm; ground-based activity recognition; multimodal association; multimodal transition; unsupervised learning algorithm; Accuracy; Clustering algorithms; Feature extraction; Hidden Markov models; Sensors; Surveillance; Visualization; activity recognition; audio-video; long range sensing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition Workshops (CVPRW), 2014 IEEE Conference on
  • Conference_Location
    Columbus, OH
  • Type
    conf
  • DOI
    10.1109/CVPRW.2014.43
  • Filename
    6909988