Keep it simple and sparse: real-time action recognition

  • Authors: Sean Ryan Fanello; Ilaria Gori; Giorgio Metta; Francesca Odone

  • Affiliations: iCub Facility, Istituto Italiano di Tecnologia, Genova, Italia (Fanello, Gori, Metta); Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi, Università degli Studi di Genova, Genova, Italia (Odone)

  • Venue: The Journal of Machine Learning Research
  • Year: 2013

Abstract

Sparsity has been shown to be one of the most important properties for visual recognition. In this paper we show that sparse representation plays a fundamental role in achieving one-shot learning and real-time recognition of actions. We start from RGBD images, combine motion and appearance cues, and extract state-of-the-art features in a computationally efficient way. The proposed method relies on descriptors based on 3D Histograms of Scene Flow (3DHOFs) and Global Histograms of Oriented Gradient (GHOGs); adaptive sparse coding is applied to capture high-level patterns from the data. We then propose a method for simultaneous on-line video segmentation and action recognition using linear SVMs. The main contribution of the paper is an effective real-time system for one-shot action modeling and recognition; the paper also highlights the effectiveness of sparse coding techniques for representing 3D actions. We obtain very good results on three different data sets: a benchmark data set for one-shot action learning (the ChaLearn Gesture Data Set), an in-house data set acquired with a Kinect sensor that includes complex actions and gestures differing in small details, and a data set created for human-robot interaction purposes. Finally, we demonstrate that our system is also effective in a human-robot interaction setting and propose a memory game, "All Gestures You Can", to be played against a humanoid robot.
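
To make the sparse coding step mentioned in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a fixed, already known dictionary (the paper's adaptive variant learns the dictionary from data) and uses plain ISTA (iterative soft-thresholding), a standard solver swapped in here for illustration. The names `soft_threshold`, `sparse_code`, `lam`, and the toy dictionary are all hypothetical.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t (proximal operator of the L1 norm)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sparse_code(descriptor, dictionary, lam=0.1, iters=200):
    """Approximately minimise 0.5 * ||x - D u||^2 + lam * ||u||_1 over u with ISTA.

    descriptor: list of floats (x), e.g. one frame-level feature vector.
    dictionary: list of atoms, each a list with len(descriptor) entries (D).
    Returns the code u as a list with one coefficient per atom.
    """
    n_atoms = len(dictionary)
    dim = len(descriptor)
    # Conservative step size: the squared Frobenius norm of D upper-bounds
    # the Lipschitz constant of the gradient of the quadratic term.
    frob_sq = sum(v * v for atom in dictionary for v in atom)
    step = 1.0 / frob_sq
    code = [0.0] * n_atoms
    for _ in range(iters):
        # Reconstruction D u and residual D u - x.
        recon = [sum(code[k] * dictionary[k][i] for k in range(n_atoms))
                 for i in range(dim)]
        residual = [recon[i] - descriptor[i] for i in range(dim)]
        # Gradient of the quadratic term with respect to each coefficient.
        grad = [sum(dictionary[k][i] * residual[i] for i in range(dim))
                for k in range(n_atoms)]
        # Gradient step followed by soft-thresholding.
        code = [soft_threshold(code[k] - step * grad[k], step * lam)
                for k in range(n_atoms)]
    return code


# Toy example (hypothetical data): three orthonormal atoms in 3 dimensions.
toy_dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_descriptor = [1.0, 0.05, 0.0]
toy_code = sparse_code(toy_descriptor, toy_dictionary, lam=0.1)
```

In this toy case the small second component falls below the threshold and is zeroed out, so only the first atom stays active. In a pipeline like the one the abstract outlines, codes of this kind, computed from the 3DHOF and GHOG descriptors, would then be passed to a linear classifier.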