• DocumentCode
    2287675
  • Title
    Action detection in complex scenes with spatial and temporal ambiguities
  • Author
    Hu, Yuxiao ; Cao, Liangliang ; Lv, Fengjun ; Yan, Shuicheng ; Gong, Yihong ; Huang, Thomas S.
  • Author_Institution
    Dept. of ECE, UIUC, Singapore, Singapore
  • fYear
    2009
  • fDate
    Sept. 29 2009-Oct. 2 2009
  • Firstpage
    128
  • Lastpage
    135
  • Abstract
    In this paper, we investigate the detection of semantic human actions in complex scenes. Unlike conventional action recognition in well-controlled environments, action detection in complex scenes suffers from cluttered backgrounds, heavy crowds, occluded bodies, and spatial-temporal boundary ambiguities caused by imperfect human detection and tracking. Conventional algorithms are likely to fail under such spatial-temporal ambiguities. In this work, the candidate regions of an action are treated as a bag of instances. A novel multiple-instance learning framework, named SMILE-SVM (Simulated annealing Multiple Instance LEarning Support Vector Machines), is then presented for learning a human action detector from imprecise action locations. SMILE-SVM is extensively evaluated, with satisfactory performance, on two tasks: (1) human action detection on a public video action database with cluttered backgrounds, and (2) a real-world problem of detecting whether customers in a shopping mall show an intention to purchase the merchandise on the shelf (even if they did not eventually buy it). In addition, the complementary nature of motion and appearance features in action detection is validated, demonstrating boosted performance in our experiments.
  • Keywords
    object detection; semantic networks; simulated annealing; support vector machines; SMILE-SVM; cluttered backgrounds; human action detection; occluded bodies; real world problem; semantic human actions detection; simulated annealing multiple instance learning support vector machines; spatial-temporal boundary ambiguities; video action database; Computer vision; Detectors; Humans; Layout; Machine learning; Merchandise; Performance evaluation; Simulated annealing; Spatial databases; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision, 2009 IEEE 12th International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1550-5499
  • Print_ISBN
    978-1-4244-4420-5
  • Electronic_ISBN
    1550-5499
  • Type
    conf
  • DOI
    10.1109/ICCV.2009.5459153
  • Filename
    5459153