Human activity encoding and recognition using low-level visual features

  • Authors:
  • Zheshen Wang; Baoxin Li

  • Affiliations:
  • Department of Computer Science and Engineering, Arizona State University (both authors)

  • Venue:
  • IJCAI'09: Proceedings of the 21st International Joint Conference on Artificial Intelligence
  • Year:
  • 2009


Abstract

Automatic recognition of human activities is among the key capabilities of many intelligent systems with vision/perception. Most existing approaches to this problem require sophisticated feature extraction before classification can be performed. This paper presents a novel approach to human action recognition using only simple low-level visual features: motion captured from direct frame differencing. A codebook of key poses is first created from the training data through unsupervised clustering. Videos of actions are then coded as sequences of super-frames, defined as the key poses augmented with discriminative attributes. A weighted-sequence distance is proposed for comparing two super-frame sequences; it is further wrapped as a kernel embedded in an SVM classifier for the final classification. Compared with conventional methods, our approach provides a flexible non-parametric sequential structure, with a corresponding distance measure, for human action representation and classification without requiring complex feature extraction. The effectiveness of our approach is demonstrated on the widely used KTH human activity dataset, on which the proposed method outperforms the existing state of the art.
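The front end of the pipeline described above (frame differencing, unsupervised clustering of difference features into a key-pose codebook, and coding each video as a sequence of codebook indices) can be sketched as follows. This is a minimal illustration on synthetic data, assuming a plain k-means clusterer as the unsupervised step; all function names and parameters are hypothetical, and the paper's super-frame attributes and weighted-sequence kernel are not reproduced here.

```python
import numpy as np

def frame_difference_features(frames):
    # Absolute differences between consecutive frames, flattened
    # into one feature vector per frame transition.
    diffs = np.abs(np.diff(frames.astype(float), axis=0))
    return diffs.reshape(diffs.shape[0], -1)

def build_codebook(features, k, iters=20, seed=0):
    # Plain k-means stand-in for the unsupervised clustering step:
    # cluster difference features into k "key poses".
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        dists = ((features[:, None] - centers[None]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return centers

def encode_video(frames, centers):
    # Code a video as the sequence of nearest key-pose indices,
    # one index per frame difference.
    feats = frame_difference_features(frames)
    dists = ((feats[:, None] - centers[None]) ** 2).sum(-1)
    return np.argmin(dists, axis=1)

# Toy demo: three "videos" of 10 random 8x8 frames each.
rng = np.random.default_rng(1)
videos = [rng.integers(0, 256, size=(10, 8, 8)) for _ in range(3)]
train = np.vstack([frame_difference_features(v) for v in videos])
codebook = build_codebook(train, k=4)
seq = encode_video(videos[0], codebook)
print(seq.shape)  # 9 differences for 10 frames: (9,)
```

In the paper, each codebook index would additionally carry discriminative attributes (forming a super-frame), and the resulting sequences would be compared by the proposed weighted-sequence distance rather than frame-by-frame.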