Local velocity-adapted motion events for spatio-temporal recognition

  • Authors:
  • Ivan Laptev; Barbara Caputo; Christian Schüldt; Tony Lindeberg

  • Affiliations:
  • IRISA/INRIA, Campus de Beaulieu, 35042 Rennes, France; IDIAP, Rue de Simplon 4, P.O. Box 592, 1920 Martigny, Switzerland; Computational Vision and Active Perception Laboratory (CVAP), School of Computer Science and Communication KTH, S-100 44 Stockholm, Sweden; Computational Vision and Active Perception Laboratory (CVAP), School of Computer Science and Communication KTH, S-100 44 Stockholm, Sweden

  • Venue:
  • Computer Vision and Image Understanding
  • Year:
  • 2007

Abstract

In this paper, we address the problem of motion recognition using event-based local motion representations. We assume that similar patterns of motion contain similar events with consistent motion across image sequences. Using this assumption, we formulate motion recognition as the matching of corresponding events in image sequences. To enable the matching, we present and evaluate a set of motion descriptors that exploit the spatial and temporal coherence of motion measurements between corresponding events in image sequences. As the motion measurements may depend on the relative motion of the camera, we also present a mechanism for local velocity adaptation of events and evaluate its influence when recognizing image sequences subjected to different camera motions. When recognizing motion patterns, we compare the performance of a nearest neighbor (NN) classifier with that of a support vector machine (SVM). We also compare event-based motion representations to motion representations in terms of global histograms. A systematic experimental evaluation on a large video database of human actions demonstrates that (i) local spatio-temporal image descriptors can be defined to carry important information about space-time events for subsequent recognition, and that (ii) local velocity adaptation is an important mechanism in situations where the relative motion between the camera and the events of interest in the scene is unknown. The particular advantage of event-based representations and velocity adaptation is further emphasized when recognizing human actions in unconstrained scenes with complex and non-stationary backgrounds.
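
The idea of recognizing motion by matching corresponding events and then applying a nearest neighbor classifier can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: each event is represented by a plain numeric descriptor vector, events are paired by a greedy one-to-one matching on Euclidean distance, and the dissimilarity of two sequences is the mean distance over matched pairs. The function names `sequence_distance` and `classify_nn` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sequence_distance(events_a, events_b):
    """Dissimilarity of two sequences, each given as a list of event descriptors.

    Assumption: events are paired greedily, closest pair first, with each
    event used at most once; the result is the mean distance of the pairs.
    """
    candidates = sorted(
        (euclidean(da, db), i, j)
        for i, da in enumerate(events_a)
        for j, db in enumerate(events_b)
    )
    used_a, used_b = set(), set()
    total, count = 0.0, 0
    for dist, i, j in candidates:
        if i in used_a or j in used_b:
            continue  # one of the two events is already matched
        used_a.add(i)
        used_b.add(j)
        total += dist
        count += 1
    # With no events to match, treat the sequences as maximally dissimilar.
    return total / count if count else float("inf")


def classify_nn(query_events, training_set):
    """Return the label of the training sequence closest to the query.

    training_set is a list of (label, events) pairs.
    """
    best_label, _ = min(
        ((label, sequence_distance(query_events, events))
         for label, events in training_set),
        key=lambda item: item[1],
    )
    return best_label


if __name__ == "__main__":
    # Toy two-dimensional descriptors, purely illustrative.
    training = [
        ("walking", [[0.0, 0.0], [1.0, 0.0]]),
        ("waving", [[5.0, 5.0], [6.0, 5.0]]),
    ]
    print(classify_nn([[0.1, 0.0], [0.9, 0.1]], training))
```

An SVM, the other classifier compared in the abstract, would need a kernel defined over sets of events rather than a raw distance; that is outside the scope of this sketch.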