Making action recognition robust to occlusions and viewpoint changes

Authors:
Daniel Weinland; Mustafa Özuysal; Pascal Fua
Affiliations:
Deutsche Telekom Laboratories, TU Berlin, Germany; Computer Vision Laboratory, EPFL, Switzerland; Computer Vision Laboratory, EPFL, Switzerland
Venue:
ECCV'10 Proceedings of the 11th European Conference on Computer Vision: Part III
Year:
2010

Abstract

Most state-of-the-art approaches to action recognition rely on global representations, either concatenating local information into one long descriptor vector or computing a single location-independent histogram. This limits their performance in the presence of occlusions and when they are run on multiple viewpoints. We propose a novel approach that provides robustness to both occlusions and viewpoint changes and yields significant improvements over existing techniques. At its heart is a local partitioning and hierarchical classification of the 3D Histogram of Oriented Gradients (HOG) descriptor, which represents sequences of images that have been concatenated into a data volume. We achieve robustness to occlusions and viewpoint changes by combining training data from all viewpoints to train classifiers that estimate action labels independently over sets of HOG blocks. A top-level classifier then combines these local labels into a global action class decision.
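
The two-level idea in the abstract (independent local classifiers over descriptor blocks, then a top-level classifier over their labels) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the nearest-centroid classifiers stand in for whatever classifiers the authors use, the synthetic "blocks" stand in for 3D HOG blocks, and the function names (make_data, fit_hierarchy, predict_hierarchy) are hypothetical. In a multi-view setting, the training array would simply pool samples from all viewpoints.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Nearest-centroid stand-in classifier: one mean vector per class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_centroids(centroids, X):
    """Assign each row of X to the class with the closest centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def local_votes(block_models, blocks, n_classes):
    """One-hot encode each block's local label; shape (n, n_blocks * n_classes)."""
    n, n_blocks, _ = blocks.shape
    votes = np.zeros((n, n_blocks, n_classes))
    for b, model in enumerate(block_models):
        labels = predict_centroids(model, blocks[:, b, :])
        votes[np.arange(n), b, labels] = 1.0
    return votes.reshape(n, -1)

def fit_hierarchy(blocks, y, n_classes):
    """Train one local classifier per block, then a top-level one on their votes."""
    block_models = [fit_centroids(blocks[:, b, :], y, n_classes)
                    for b in range(blocks.shape[1])]
    top = fit_centroids(local_votes(block_models, blocks, n_classes), y, n_classes)
    return block_models, top

def predict_hierarchy(model, blocks, n_classes):
    """Combine the local block labels into one global class decision."""
    block_models, top = model
    return predict_centroids(top, local_votes(block_models, blocks, n_classes))

def make_data(rng, n_per_class, n_classes=3, n_blocks=8, dim=4):
    """Synthetic stand-in for block descriptors (hypothetical, not real HOG)."""
    y = np.repeat(np.arange(n_classes), n_per_class)
    protos = (10.0 * np.arange(n_classes)[:, None, None]
              + np.arange(n_blocks)[None, :, None] + np.zeros(dim))
    blocks = protos[y] + 0.1 * rng.normal(size=(y.size, n_blocks, dim))
    return blocks, y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train, y_train = make_data(rng, 20)
    X_test, y_test = make_data(rng, 10)
    model = fit_hierarchy(X_train, y_train, 3)
    occluded = X_test.copy()
    occluded[:, :2, :] = 0.0  # simulate occlusion of the first two blocks
    acc = (predict_hierarchy(model, occluded, 3) == y_test).mean()
    print("accuracy with two occluded blocks:", acc)
```

Because each block is labeled independently, corrupting a few blocks changes only a few local votes, and the top-level classifier can still recover the global label from the remaining ones. A single classifier over the concatenated descriptor has no such isolation between blocks.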