Human action recognition in video by fusion of structural and spatio-temporal features

  • Authors:
  • Ehsan Zare Borzeshi; Oscar Perez Concha; Massimo Piccardi

  • Affiliations:
  • School of Computing and Communications, Faculty of Engineering and IT, University of Technology, Sydney (UTS), Sydney, Australia; Centre for Health Informatics, Australian Institute of Health Innovation, University of New South Wales (UNSW), Sydney, Australia; School of Computing and Communications, Faculty of Engineering and IT, University of Technology, Sydney (UTS), Sydney, Australia

  • Venue:
  • SSPR'12/SPR'12 Proceedings of the 2012 Joint IAPR International Conference on Structural, Syntactic, and Statistical Pattern Recognition
  • Year:
  • 2012

Abstract

The problem of human action recognition has received increasing attention in recent years owing to its importance in many applications. Local representations, and in particular STIP descriptors, have gained popularity for action recognition. Yet, the main limitation of those approaches is that they do not capture the spatial relationships within the subject performing the action. This paper proposes a novel method based on the fusion of the global spatial relationships provided by graph embedding and the local spatio-temporal information of STIP descriptors. Experiments on an action recognition dataset reported in the paper show that recognition accuracy can be significantly improved by combining the structural information with the spatio-temporal features.
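The abstract describes fusing a global structural representation (a graph-embedding vector) with local spatio-temporal STIP features, but does not specify the fusion scheme. A minimal sketch of one common choice, early (feature-level) fusion by normalised concatenation, is shown below; the function name and the toy vectors are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

def fuse_features(struct_vec, stip_hist):
    """Early (feature-level) fusion: concatenate a graph-embedding
    structural vector with a STIP bag-of-features histogram.
    Both inputs are L2-normalised first so neither modality dominates.
    (Illustrative sketch only; not the method described in the paper.)"""
    s = struct_vec / (np.linalg.norm(struct_vec) + 1e-12)
    t = stip_hist / (np.linalg.norm(stip_hist) + 1e-12)
    return np.concatenate([s, t])

# Hypothetical toy data: a 3-D structural embedding and a 4-bin STIP histogram
fused = fuse_features(np.array([1.0, 2.0, 2.0]),
                      np.array([0.0, 3.0, 4.0, 0.0]))
print(fused.shape)  # (7,)
```

The fused vector can then be passed to any standard classifier; the normalisation step is one simple way to balance the two feature types before concatenation.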