Scale Invariant Action Recognition Using Compound Features Mined from Dense Spatio-temporal Corners

  • Authors:
  • Andrew Gilbert; John Illingworth; Richard Bowden

  • Affiliations:
  • CVSSP, University of Surrey, Guildford, England GU2 7XH (all three authors)

  • Venue:
  • ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I
  • Year:
  • 2008

Abstract

The use of sparse invariant features to recognise classes of actions or objects has become common in the literature. However, features are often "engineered" to be both sparse and invariant to transformation, and it is assumed that they provide the greatest discriminative information. To tackle activity recognition, we propose learning compound features that are assembled from simple 2D corners in both space and time. Each corner is encoded in relation to its neighbours, and compound features are extracted by data mining from an overcomplete set (in excess of 1 million possible features). The final classifier, consisting of sets of compound features, can then be applied to recognise and localise an activity in real time while providing superior performance to other state-of-the-art approaches (including those based upon sparse feature detectors). Furthermore, the approach requires only weak supervision in the form of class labels for each training sequence; no ground-truth position or temporal alignment is required during training.
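
The abstract does not say which data-mining algorithm is used or how corners are encoded, so the following is only a minimal sketch of the general idea: treat each corner neighbourhood as a set of encoded items, then keep the item combinations that occur often and almost exclusively in one action class. The function name `mine_compound_features`, the thresholds, and the toy items `"a"` to `"d"` are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def mine_compound_features(transactions, labels, target, max_size=2,
                           min_support=2, min_confidence=0.8):
    """Return item combinations that are frequent and specific to `target`.

    Illustrative frequent-itemset counting only; an assumption, not the
    authors' implementation.
    """
    total = Counter()      # occurrences of each combination over all classes
    in_target = Counter()  # occurrences within the target class only
    for items, label in zip(transactions, labels):
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                total[combo] += 1
                if label == target:
                    in_target[combo] += 1

    mined = {}
    for combo, hits in in_target.items():
        confidence = hits / total[combo]  # fraction of occurrences in target
        if hits >= min_support and confidence >= min_confidence:
            mined[combo] = confidence
    return mined


# Toy data: each set stands for the encoded corners around one location
# (hypothetical item names), labelled only with the sequence's class.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"b", "d"}]
labels = ["wave", "wave", "walk", "walk"]

compound = mine_compound_features(transactions, labels, target="wave")
print(compound)
```

In this toy data, neither `"a"` nor `"b"` alone is specific to the `"wave"` class, since each also appears in a `"walk"` sequence, but the pair occurs only in `"wave"`. That is the intuition behind preferring compound features over individual corners. Only class labels are used, in line with the weak supervision the abstract describes.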