Propagative hough voting for human activity recognition
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Cost-Sensitive top-down/bottom-up inference for multiscale activity recognition
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part IV
Collective activity localization with contextual spatial pyramid
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Viewpoint invariant collective activity recognition with relative action context
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
International Journal of Computer Vision
A survey of video datasets for human action and activity recognition
Computer Vision and Image Understanding
Given a video, we would like to recognize group activities, localize the video parts where these activities occur, and detect the actors involved in them. This advances prior work, which typically focuses only on video classification. We make a number of contributions. First, we specify a new mid-level video feature aimed at summarizing local visual cues into bags of the right detections (BORDs). BORDs seek to identify the right people who participate in a target group activity among many noisy people detections. Second, we formulate a new generative chains model of group activities. Inference of the chains model identifies a subset of BORDs in the video that belong to occurrences of the activity, and organizes them in an ensemble of temporal chains. The chains extend over, and thus localize, the time intervals occupied by the activity. We formulate a new MAP inference algorithm that iterates two steps: i) warps the chains of BORDs in space and time to their expected locations, so the transformed BORDs can better summarize local visual cues; and ii) maximizes the posterior probability of the chains. We outperform the state of the art on the benchmark UT-Human Interaction and Collective Activities datasets, with reasonable running times.
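The two-step MAP inference described in the abstract alternates between warping BORDs toward their expected chain locations and evaluating the chains' posterior. A minimal sketch of this alternation is below; all names (`warp_bords`, `chain_posterior`) and the toy quadratic posterior are illustrative assumptions, not the authors' actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def warp_bords(bords, expected):
    """Step i (toy version): move each BORD part-way toward its
    expected location on the chain."""
    return bords + 0.5 * (expected - bords)

def chain_posterior(bords, expected):
    """Toy log-posterior: higher when BORDs sit closer to their
    expected chain locations (a stand-in for the real model)."""
    return -float(np.sum((bords - expected) ** 2))

def map_inference(bords, expected, n_iters=20):
    """Alternate step i (warp) and step ii (posterior evaluation),
    mirroring the iterative structure described in the abstract."""
    history = []
    for _ in range(n_iters):
        bords = warp_bords(bords, expected)               # step i
        history.append(chain_posterior(bords, expected))  # step ii
    return bords, history

# 5 BORDs with (x, y, t) coordinates, and their expected chain locations.
bords = rng.normal(size=(5, 3))
expected = rng.normal(size=(5, 3))
final, history = map_inference(bords, expected)
```

With this toy objective, each warp strictly reduces the distance to the expected locations, so the log-posterior is non-decreasing across iterations; the real algorithm's convergence behavior depends on the actual chains model.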