Weakly Supervised Learning of Interactions between Humans and Objects

  • Authors: Alessandro Prest; Cordelia Schmid; Vittorio Ferrari

  • Affiliations: ETH, Zurich, and INRIA, Grenoble; INRIA, Grenoble; ETH, Zurich

  • Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year: 2012

Abstract

We introduce a weakly supervised approach for learning human actions modeled as interactions between humans and objects. Our approach is human-centric: We first localize a human in the image and then determine the object relevant for the action and its spatial relation with the human. The model is learned automatically from a set of still images annotated only with the action label. Our approach relies on a human detector to initialize the model learning. For robustness to various degrees of visibility, we build a detector that learns to combine a set of existing part detectors. Starting from humans detected in a set of images depicting the action, our approach determines the action object and its spatial relation to the human. Its final output is a probabilistic model of the human-object interaction, i.e., the spatial relation between the human and the object. We present an extensive experimental evaluation on the sports action data set from [1], the PASCAL Action 2010 data set [2], and a new human-object interaction data set.
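To make the idea of a probabilistic model of the spatial relation between a human and an object concrete, here is a minimal sketch. The abstract does not state the form of the model, so everything below is an assumption made for illustration: the box format (x_min, y_min, x_max, y_max), the features (object offset and log area ratio, normalized by the human box), the diagonal Gaussian, and the function names `relative_location`, `fit_gaussian`, and `log_likelihood` are all hypothetical and are not taken from the paper.

```python
import math

def relative_location(human, obj):
    """Describe an object box relative to a human box.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption.
    Returns (dx, dy, log_scale): the object centre offset in units of the
    human box width/height, and the log of the area ratio.
    """
    hx, hy = (human[0] + human[2]) / 2.0, (human[1] + human[3]) / 2.0
    hw, hh = human[2] - human[0], human[3] - human[1]
    ox, oy = (obj[0] + obj[2]) / 2.0, (obj[1] + obj[3]) / 2.0
    ow, oh = obj[2] - obj[0], obj[3] - obj[1]
    dx = (ox - hx) / hw
    dy = (oy - hy) / hh
    log_scale = math.log((ow * oh) / (hw * hh))
    return (dx, dy, log_scale)

def fit_gaussian(features, min_var=1e-3):
    """Fit an independent (diagonal) Gaussian to a list of feature tuples.

    min_var is a floor on the variance to avoid degenerate fits on
    tiny training sets (an illustrative choice, not from the paper).
    """
    n = len(features)
    dims = len(features[0])
    means = [sum(f[d] for f in features) / n for d in range(dims)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in features) / n, min_var)
        for d in range(dims)
    ]
    return means, variances

def log_likelihood(feature, model):
    """Log density of one feature tuple under the diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
        for x, m, v in zip(feature, means, variances)
    )

# Toy training set: the object sits to the right of the human at chest height.
human = (0, 0, 100, 200)
training_objects = [(100, 50, 150, 100), (95, 55, 145, 105), (105, 45, 155, 95)]
model = fit_gaussian([relative_location(human, o) for o in training_objects])

# A placement consistent with training scores higher than an unrelated one.
near = relative_location(human, (98, 50, 148, 100))
far = relative_location(human, (-300, 300, -250, 350))
print(log_likelihood(near, model) > log_likelihood(far, model))
```

Normalizing by the human box makes the features independent of image resolution and of how large the person appears, which fits the human-centric framing of the abstract. In a weakly supervised setting, scores of this kind could be used to rank candidate object windows around each detected human, though how the paper does this is not described here.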