Human activity encoding and recognition using low-level visual features

  • Authors:
  • Zheshen Wang; Baoxin Li

  • Affiliations:
  • Department of Computer Science and Engineering, Arizona State University (both authors)

  • Venue:
  • IJCAI'09: Proceedings of the 21st International Joint Conference on Artificial Intelligence
  • Year:
  • 2009


Abstract

Automatic recognition of human activities is among the key capabilities of many intelligent systems with vision/perception. Most existing approaches to this problem require sophisticated feature extraction before classification can be performed. This paper presents a novel approach to human action recognition using only simple low-level visual features: motion captured from direct frame differencing. A codebook of key poses is first created from the training data through unsupervised clustering. Videos of actions are then coded as sequences of super-frames, defined as the key poses augmented with discriminative attributes. A weighted-sequence distance is proposed for comparing two super-frame sequences; it is further wrapped as a kernel embedded in an SVM classifier for the final classification. Compared with conventional methods, our approach provides a flexible non-parametric sequential structure, with a corresponding distance measure, for human action representation and classification without requiring complex feature extraction. The effectiveness of our approach is demonstrated on the widely used KTH human activity dataset, on which the proposed method outperforms the existing state of the art.
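The front end of the pipeline described above (frame differencing, unsupervised clustering of difference features into a key-pose codebook, and coding each video as a sequence of codebook indices) can be sketched as follows. This is a minimal illustration on synthetic data, assuming a plain k-means clusterer as the unsupervised step; all function names and parameters are hypothetical, and the paper's super-frame attributes and weighted-sequence kernel are not reproduced here.

```python
import numpy as np

def frame_difference_features(frames):
    # Absolute differences between consecutive frames, flattened
    # into one feature vector per frame transition.
    diffs = np.abs(np.diff(frames.astype(float), axis=0))
    return diffs.reshape(diffs.shape[0], -1)

def build_codebook(features, k, iters=20, seed=0):
    # Plain k-means stand-in for the unsupervised clustering step:
    # cluster difference features into k "key poses".
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        dists = ((features[:, None] - centers[None]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return centers

def encode_video(frames, centers):
    # Code a video as the sequence of nearest key-pose indices,
    # one index per frame difference.
    feats = frame_difference_features(frames)
    dists = ((feats[:, None] - centers[None]) ** 2).sum(-1)
    return np.argmin(dists, axis=1)

# Toy demo: three "videos" of 10 random 8x8 frames each.
rng = np.random.default_rng(1)
videos = [rng.integers(0, 256, size=(10, 8, 8)) for _ in range(3)]
train = np.vstack([frame_difference_features(v) for v in videos])
codebook = build_codebook(train, k=4)
seq = encode_video(videos[0], codebook)
print(seq.shape)  # 9 differences for 10 frames: (9,)
```

In the paper, each codebook index would additionally carry discriminative attributes (forming a super-frame), and the resulting sequences would be compared by the proposed weighted-sequence distance rather than frame-by-frame.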