Registration of Translated and Rotated Images Using Finite Fourier Transforms
IEEE Transactions on Pattern Analysis and Machine Intelligence
Nonlinear component analysis as a kernel eigenvalue problem
Neural Computation
A Bayesian Computer Vision System for Modeling Human Interactions
IEEE Transactions on Pattern Analysis and Machine Intelligence
Normalized Cuts and Image Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence
Motion Segmentation and Tracking Using Normalized Cuts
ICCV '98 Proceedings of the Sixth International Conference on Computer Vision
Large-Scale Event Detection Using Semi-Hidden Markov Models
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Pattern Classification (2nd Edition)
Spectral Grouping Using the Nyström Method
IEEE Transactions on Pattern Analysis and Machine Intelligence
Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision
International Journal of Computer Vision
Hierarchical Bag of Paths for Kernel Based Shape Classification
SSPR & SPR '08 Proceedings of the 2008 Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
Stable and Efficient Gaussian Process Calculations
The Journal of Machine Learning Research
Clustering Point Trajectories with Various Life-Spans
CVMP '09 Proceedings of the 2009 Conference for Visual Media Production
Two-frame motion estimation based on polynomial expansion
SCIA'03 Proceedings of the 13th Scandinavian conference on Image analysis
Proceedings of the 19th international conference on World wide web
Object Detection with Discriminatively Trained Part-Based Models
IEEE Transactions on Pattern Analysis and Machine Intelligence
Object, scene and actions: combining multiple features for human action recognition
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Representing pairwise spatial and temporal relations for action recognition
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Modeling temporal structure of decomposable motion segments for activity classification
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part II
Object segmentation by long term analysis of point trajectories
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part V
Computer Vision: Algorithms and Applications
Action Recognition Using Mined Hierarchical Compound Features
IEEE Transactions on Pattern Analysis and Machine Intelligence
Contour Detection and Hierarchical Image Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence
Hidden Part Models for Human Action Recognition: Probabilistic versus Max Margin
IEEE Transactions on Pattern Analysis and Machine Intelligence
Recognizing Human Actions by Learning and Matching Shape-Motion Prototype Trees
IEEE Transactions on Pattern Analysis and Machine Intelligence
Human detection using oriented histograms of flow and appearance
ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part II
Actom sequence models for efficient action detection
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Track to the future: Spatio-temporal video segmentation with long-range motion cues
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Action bank: A high-level representation of activity in video
CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Discovering discriminative action parts from mid-level video representations
CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Learning latent temporal structure for complex event detection
CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
HMDB: A large video database for human motion recognition
ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
Learning spatiotemporal graphs of human activities
ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
Human activities as stochastic kronecker graphs
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II
Propagative hough voting for human activity recognition
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Trajectory-Based modeling of human actions with motion reference points
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part V
Motion interchange patterns for action recognition in unconstrained videos
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part VI
Space-variant descriptor sampling for action recognition based on saliency and eye movements
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part VII
Explicit Modeling of Human-Object Interactions in Realistic Videos
IEEE Transactions on Pattern Analysis and Machine Intelligence
Complex activities, such as pole vaulting, are composed of a variable number of sub-events connected by complex spatio-temporal relations, whereas simple actions can be represented as sequences of short temporal parts. In this paper, we learn hierarchical representations of activity videos in an unsupervised manner. These hierarchies of mid-level motion components are data-driven decompositions specific to each video. We introduce a spectral divisive clustering algorithm that efficiently extracts a hierarchy over a large number of tracklets (i.e., local trajectories). We use this structure to represent a video as an unordered binary tree, which we model using nested histograms of local motion features. We provide an efficient positive definite kernel that computes the structural and visual similarity of two hierarchical decompositions by relying on models of their parent-child relations. We present experimental results on four recent challenging benchmarks: the High Five dataset (Patron-Perez et al., High five: recognising human interactions in TV shows, 2010), the Olympic Sports dataset (Niebles et al., Modeling temporal structure of decomposable motion segments for activity classification, 2010), the Hollywood 2 dataset (Marszalek et al., Actions in context, 2009), and the HMDB dataset (Kuehne et al., HMDB: A large video database for human motion recognition, 2011). We show that per-video hierarchies provide additional information for activity recognition. Our approach improves over unstructured activity models, baselines using other motion decomposition algorithms, and the state of the art.
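To make the pipeline described in the abstract more concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation, and the abstract does not specify these details; everything here is an assumption made for illustration. The split rule (sign of the second eigenvector of the normalized Laplacian), the stopping criteria (`min_size`, `depth`), the edge descriptor (concatenated parent and child histograms), and the averaged linear kernel are illustrative choices. The function names `fiedler_split`, `build_tree`, `edge_descriptors`, and `tree_kernel` are hypothetical, as are the toy tracklet positions and quantized feature labels (`words`).

```python
import numpy as np


def fiedler_split(affinity):
    """Bipartition items by the sign of the second eigenvector of the
    normalized graph Laplacian (an assumed split rule, for illustration)."""
    degree = np.maximum(affinity.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues come back in ascending order
    return eigvecs[:, 1] * d_inv_sqrt >= 0


def build_tree(affinity, words, n_words, indices=None, min_size=2, depth=3):
    """Divisive clustering: recursively split tracklets into a binary tree.
    Each node stores the histogram of its tracklets' quantized features, so a
    parent's histogram is the sum of its children's (nested histograms)."""
    if indices is None:
        indices = np.arange(len(affinity))
    hist = np.bincount(words[indices], minlength=n_words).astype(float)
    node = {"indices": indices, "hist": hist, "children": []}
    if depth == 0 or len(indices) < max(2, 2 * min_size):
        return node
    mask = fiedler_split(affinity[np.ix_(indices, indices)])
    parts = [indices[mask], indices[~mask]]
    if min(len(p) for p in parts) >= max(1, min_size):
        node["children"] = [
            build_tree(affinity, words, n_words, p, min_size, depth - 1) for p in parts
        ]
    return node


def edge_descriptors(node):
    """One descriptor per parent-child edge: the L1-normalized parent and
    child histograms, concatenated (an assumed edge model)."""
    edges = []
    for child in node["children"]:
        parent_h = node["hist"] / node["hist"].sum()
        child_h = child["hist"] / child["hist"].sum()
        edges.append(np.concatenate([parent_h, child_h]))
        edges.extend(edge_descriptors(child))
    return edges


def tree_kernel(tree_a, tree_b):
    """Average linear kernel over all pairs of edges. This equals the inner
    product of the two mean edge descriptors, hence it is positive definite."""
    ea, eb = edge_descriptors(tree_a), edge_descriptors(tree_b)
    if not ea or not eb:
        return 0.0
    return float(np.mean(np.array(ea) @ np.array(eb).T))


# Toy usage: eight hypothetical tracklets in two well-separated spatial groups.
points = np.array([[0, 0], [0, .5], [.5, 0], [.5, .5],
                   [3, 3], [3, 3.5], [3.5, 3], [3.5, 3.5]], dtype=float)
sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
affinity = np.exp(-sq_dist / 2.0)             # Gaussian affinity between tracklets
words = np.array([0, 0, 1, 1, 2, 2, 3, 3])    # quantized motion features (made up)
tree = build_tree(affinity, words, n_words=4, depth=1)
print([child["indices"].tolist() for child in tree["children"]])
print(tree_kernel(tree, tree))
```

Averaging a linear kernel over edge pairs is only one simple way to obtain a positive definite similarity between two trees. A real system would likely build the affinity from both spatial and temporal proximity of tracklets, and could replace the linear kernel with a histogram-specific one.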