Human Activity Recognition Using the 4D Spatiotemporal Shape Context Descriptor

  • Authors:
  • Natasha Kholgade; Andreas Savakis

  • Affiliations:
  • Department of Computer Engineering, Rochester Institute of Technology, Rochester 14623 (both authors)

  • Venue:
  • ISVC '09 Proceedings of the 5th International Symposium on Advances in Visual Computing: Part II
  • Year:
  • 2009

Abstract

In this paper, a four-dimensional spatiotemporal shape context descriptor is introduced and used for human activity recognition in video. The spatiotemporal shape context is computed on silhouette points by binning the magnitude and direction of motion at every point with respect to a given vertex, in addition to the binning of radial displacement and angular offset used in the standard 2D shape context. Human activity recognition at each video frame is performed by matching the spatiotemporal shape context to a library of known activities via k-nearest-neighbor classification, and recognition for a video sequence is based on majority voting over the per-frame results. Experiments on the Weizmann set of ten activities indicate that the proposed descriptor achieves better recognition than the original 2D shape context, with overall recognition rates of 90% for individual frames and 97.9% for video sequences.
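
As a rough illustration of the pipeline described in the abstract, the sketch below (not the authors' code) builds a 4D histogram over radial displacement, angular offset, motion magnitude, and motion direction for one reference silhouette point, then classifies frames by k-nearest neighbors and a sequence by majority vote over frame labels. Bin counts, function names, and parameters are illustrative assumptions, not values from the paper.

```python
import numpy as np

def spatiotemporal_shape_context(points, flow, ref_idx,
                                 r_bins=5, theta_bins=12,
                                 mag_bins=4, dir_bins=8):
    """Hypothetical 4D shape context for one reference silhouette point.

    points: (N, 2) silhouette coordinates; flow: (N, 2) per-point motion.
    Bins radial displacement and angular offset (as in the 2D shape context)
    plus motion magnitude and direction relative to the reference vertex.
    """
    ref = points[ref_idx]
    rel = np.delete(points, ref_idx, axis=0) - ref                    # displacement from reference vertex
    rel_flow = np.delete(flow, ref_idx, axis=0) - flow[ref_idx]       # motion relative to reference vertex

    r = np.log1p(np.hypot(rel[:, 0], rel[:, 1]))                      # log-radial displacement
    theta = np.arctan2(rel[:, 1], rel[:, 0]) % (2 * np.pi)            # angular offset
    mag = np.hypot(rel_flow[:, 0], rel_flow[:, 1])                    # motion magnitude
    direc = np.arctan2(rel_flow[:, 1], rel_flow[:, 0]) % (2 * np.pi)  # motion direction

    hist, _ = np.histogramdd(
        np.stack([r, theta, mag, direc], axis=1),
        bins=(r_bins, theta_bins, mag_bins, dir_bins),
        range=[(0, r.max() + 1e-6), (0, 2 * np.pi),
               (0, mag.max() + 1e-6), (0, 2 * np.pi)])
    return hist.ravel() / max(hist.sum(), 1)                          # normalized 4D histogram

def classify_sequence(frame_descriptors, library_descriptors, library_labels, k=5):
    """k-NN label per frame, then majority vote over the sequence."""
    votes = []
    for d in frame_descriptors:
        dists = np.linalg.norm(library_descriptors - d, axis=1)       # distance to every library descriptor
        nearest = library_labels[np.argsort(dists)[:k]]               # labels of k nearest neighbors
        labels, counts = np.unique(nearest, return_counts=True)
        votes.append(labels[np.argmax(counts)])                       # per-frame activity label
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]                                  # majority label for the sequence
```

In this sketch Euclidean distance between flattened histograms stands in for the descriptor-matching cost; the paper's actual matching and binning choices may differ.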