We tackle the challenging problem of human activity recognition in realistic video sequences. Unlike methods based on local features or on global templates, we propose to represent a video sequence by a set of middle-level parts. A part, or component, has consistent spatial structure and consistent motion. We first segment the visual motion patterns and generate a set of middle-level components by clustering keypoint-based trajectories extracted from the video. To further exploit the interdependencies of the moving parts, we then define spatio-temporal relationships between pairs of components. The resulting middle-level components and pairwise components capture the essential motion characteristics of human activities. They also give a very compact representation of the video. We apply our framework to two popular and challenging video datasets: the Weizmann dataset and the UT-Interaction dataset. We demonstrate experimentally that our middle-level representation, combined with a χ²-SVM classifier, matches or outperforms the state-of-the-art results on these datasets.
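As an illustration of the classification step, the sketch below computes a χ² kernel between histogram-style video descriptors. It is a minimal sketch under stated assumptions, not the method of the paper: the abstract does not say which χ² variant is used, so the exponential form exp(-γ · Σ (xᵢ - yᵢ)² / (xᵢ + yᵢ)) is assumed here, and the function names (`chi2_distance`, `chi2_kernel`, `gram_matrix`) and the `gamma` parameter are hypothetical.

```python
import math


def chi2_distance(h1, h2):
    """Chi-squared distance between two non-negative histograms (assumed form)."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same length")
    total = 0.0
    for a, b in zip(h1, h2):
        s = a + b
        if s > 0:  # bins that are empty in both histograms contribute nothing
            total += (a - b) ** 2 / s
    return total


def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-squared kernel; gamma is a hypothetical bandwidth parameter."""
    return math.exp(-gamma * chi2_distance(h1, h2))


def gram_matrix(histograms, gamma=1.0):
    """Pairwise kernel values for a list of histograms, as a list of lists."""
    return [[chi2_kernel(a, b, gamma) for b in histograms] for a in histograms]
```

A Gram matrix built this way could be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. How the component descriptors are turned into histograms is not specified in the abstract and is left open here.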