Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition

Authors:
Chris Ellis;Syed Zain Masood;Marshall F. Tappen;Joseph J. LaViola, Jr.;Rahul Sukthankar
Affiliations:
Department of Computer Science, University of Central Florida, Orlando, USA 32826;Department of Computer Science, University of Central Florida, Orlando, USA 32826;Department of Computer Science, University of Central Florida, Orlando, USA 32826;Department of Computer Science, University of Central Florida, Orlando, USA 32826;Google Research, Google Inc, Mountain View, USA 94043
Venue:
International Journal of Computer Vision
Year:
2013

Abstract

An important aspect of designing interactive, action-based interfaces is reliably recognizing actions with minimal latency. High latency causes the system's feedback to lag behind user actions and thus significantly degrades the interactivity of the user experience. This paper presents algorithms for reducing latency when recognizing actions. We use a latency-aware learning formulation to train a logistic regression-based classifier that automatically determines distinctive canonical poses from data and uses these to robustly recognize actions in the presence of ambiguous poses. For our experiments, we introduce a novel, publicly released dataset. Comparisons of our method against both a Bag of Words classifier and a Conditional Random Field (CRF) classifier show improved recognition performance on both pre-segmented and online classification tasks. Additionally, we employ GentleBoost to reduce our feature set and further improve our results. We then present experiments that explore the accuracy/latency trade-off over a varying number of actions. Finally, we evaluate our algorithm on two existing datasets.
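
The accuracy/latency trade-off mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's latency-aware logistic regression formulation. It assumes that some per-frame classifier already outputs class probabilities, and it simply commits to a label at the first frame whose top probability reaches a threshold. The function name, the threshold values, and the probability stream are all hypothetical.

```python
from typing import List, Optional, Tuple


def first_confident_decision(
    frame_probs: List[List[float]], threshold: float
) -> Tuple[Optional[int], int]:
    """Return (label, frames_observed) for the first frame whose top class
    probability reaches `threshold`; label is None if no frame does."""
    for t, probs in enumerate(frame_probs, start=1):
        # Index of the most probable class at this frame.
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            return best, t
    return None, len(frame_probs)


# Hypothetical per-frame probabilities for two actions (class 0, class 1).
# The early frames are ambiguous; the later frames clearly favor class 1.
stream = [
    [0.55, 0.45],
    [0.60, 0.40],
    [0.30, 0.70],
    [0.10, 0.90],
]

# A low threshold decides after one frame; a high one waits for four.
eager = first_confident_decision(stream, threshold=0.5)
patient = first_confident_decision(stream, threshold=0.85)
print(eager, patient)
```

In this toy stream, the lower threshold commits to class 0 on an ambiguous early frame, while the higher threshold observes more frames before committing to class 1. Raising the threshold trades observational latency for robustness to ambiguous poses.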