Multiple scale-specific representations for improved human action recognition

  • Authors:
  • Amir H. Shabani; John S. Zelek; David A. Clausi

  • Affiliations:
  • Vision and Image Processing (VIP) Lab, Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada N2L 3G1 and Intelligent Systems Lab, Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada N2L 3G1
  • Intelligent Systems Lab, Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada N2L 3G1
  • Vision and Image Processing (VIP) Lab, Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada N2L 3G1

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2013

Abstract

Human action recognition in video is important in many computer vision applications such as automated surveillance. Human actions can be compactly encoded using a sparse set of local spatio-temporal salient features at different scales. Existing bottom-up methods construct a single dictionary of action primitives from the joint features of all scales and hence produce a single action representation. Such a representation cannot fully exploit the complementary characteristics of the motions across different scales. To address this problem, we introduce the concept of learning multiple dictionaries of action primitives at different resolutions and, consequently, multiple scale-specific representations for a given video sample. Using a decoupled fusion of the multiple representations, we improved the human action classification accuracy on realistic benchmark databases by about 5% compared with state-of-the-art methods.
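The abstract describes a bag-of-words pipeline with one dictionary per scale and a decoupled (late) fusion of the resulting scale-specific representations. The following is a minimal sketch of that idea, assuming scikit-learn's KMeans and SVC as stand-ins for the paper's actual codebook-learning and classification choices; the feature dimensions, synthetic descriptors, train/test split, and the probability-averaging fusion rule are illustrative assumptions, not the authors' exact method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Illustrative sizes: 3 spatio-temporal scales, 50-word codebooks, 40 videos,
# 72-dimensional local descriptors (all hypothetical stand-ins).
n_scales, n_words, n_videos, feat_dim = 3, 50, 40, 72
labels = rng.integers(0, 2, n_videos)  # two action classes for the sketch

# Local salient-feature descriptors per (scale, video); random placeholders
# where a real pipeline would extract spatio-temporal interest-point features.
descriptors = [[rng.normal(size=(int(rng.integers(20, 60)), feat_dim))
                for _ in range(n_videos)]
               for _ in range(n_scales)]

# 1) Learn one dictionary of action primitives per scale (k-means codebook),
#    instead of a single dictionary over the pooled features of all scales.
dictionaries = [KMeans(n_clusters=n_words, n_init=10, random_state=0)
                .fit(np.vstack(descriptors[s]))
                for s in range(n_scales)]

def bow_histogram(codebook, desc):
    """Encode one video's descriptors as an L1-normalized bag-of-words histogram."""
    words = codebook.predict(desc)
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / max(hist.sum(), 1.0)

# 2) Build one scale-specific representation per video at every scale.
reps = [np.array([bow_histogram(dictionaries[s], d) for d in descriptors[s]])
        for s in range(n_scales)]

# 3) Decoupled (late) fusion: train an independent classifier per scale and
#    average the per-scale class-probability scores at test time.
train, test = np.arange(0, 30), np.arange(30, n_videos)
classifiers = [SVC(kernel="rbf", probability=True, random_state=0)
               .fit(reps[s][train], labels[train])
               for s in range(n_scales)]
fused_scores = np.mean([clf.predict_proba(reps[s][test])
                        for s, clf in enumerate(classifiers)], axis=0)
predictions = fused_scores.argmax(axis=1)
print("fused accuracy on the synthetic split:", (predictions == labels[test]).mean())
```

Keeping a separate codebook and classifier per scale lets each resolution cast its own vote, which is what the decoupled fusion of the scale-specific representations exploits.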