Attribute learning for understanding unstructured social activity

  • Authors:
  • Yanwei Fu; Timothy M. Hospedales; Tao Xiang; Shaogang Gong

  • Affiliations:
  • School of EECS, Queen Mary University of London, UK (all authors)

  • Venue:
  • ECCV'12: Proceedings of the 12th European Conference on Computer Vision, Part IV
  • Year:
  • 2012


Abstract

The rapid development of social video sharing platforms has created a huge demand for automatic video classification and annotation techniques, in particular for videos containing social activities of a group of people (e.g. YouTube videos of a wedding reception). Recently, attribute learning has emerged as a promising paradigm for transferring learning to sparsely labelled classes in object or single-object short action classification. In contrast to existing work, this paper is the first to tackle attribute learning for understanding group social activities with sparse labels. This problem is more challenging because of the complex multi-object nature of social activities and the unstructured nature of the activity context. To solve this problem, we (1) contribute an unstructured social activity attribute (USAA) dataset with both visual and audio attributes, (2) introduce the concept of a semi-latent attribute space and (3) propose a novel model for learning the latent attributes, which alleviates the dependence of existing models on exact and exhaustive manual specification of the attribute space. We show that our framework is able to exploit latent attributes to outperform contemporary approaches on a variety of realistic multimedia sparse-data learning tasks, including multi-task learning, N-shot transfer learning, learning with label noise and, importantly, zero-shot learning.
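
The headline use case named in the abstract, zero-shot recognition through a shared attribute space, can be illustrated with a generic direct-attribute-prediction style sketch. The snippet below is a hypothetical toy on synthetic data, not the paper's semi-latent attribute model or the USAA dataset: per-attribute binary classifiers are trained on clips of seen classes, and unseen classes are recognised purely from manually specified attribute signatures.

```python
# Minimal sketch (assumed toy setup, not the authors' model): zero-shot
# classification via per-attribute classifiers and class-attribute signatures.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_attr = 5

# Seen classes, each defined by a binary attribute signature (toy values).
seen_sig = np.array([[1, 0, 1, 0, 1],
                     [0, 1, 0, 1, 0]])
y_seen = rng.integers(0, 2, size=300)
A_seen = seen_sig[y_seen]                    # per-clip attribute labels
X_seen = rng.normal(size=(300, 20))
X_seen[:, :n_attr] += 2.0 * A_seen           # features carry attribute signal

# One binary classifier per attribute, trained only on seen-class clips.
attr_clfs = [LogisticRegression(max_iter=1000).fit(X_seen, A_seen[:, a])
             for a in range(n_attr)]

def zero_shot_predict(X, signatures):
    """Assign each clip to the unseen class whose attribute signature best
    matches the predicted attribute probabilities (independent Bernoulli)."""
    p = np.column_stack([c.predict_proba(X)[:, 1] for c in attr_clfs])  # (N, A)
    log_like = (np.log(p + 1e-9) @ signatures.T
                + np.log(1 - p + 1e-9) @ (1 - signatures).T)            # (N, C)
    return log_like.argmax(axis=1)

# Unseen classes: no training clips, only attribute signatures.
unseen_sig = np.array([[1, 1, 0, 0, 1],
                       [0, 0, 1, 1, 0]])
y_unseen = rng.integers(0, 2, size=100)
X_unseen = rng.normal(size=(100, 20))
X_unseen[:, :n_attr] += 2.0 * unseen_sig[y_unseen]

acc = (zero_shot_predict(X_unseen, unseen_sig) == y_unseen).mean()
print(f"zero-shot accuracy on unseen classes: {acc:.2f}")
```

The sketch relies entirely on a hand-specified attribute space; the paper's contribution, by contrast, is to augment such user-defined attributes with learned latent ones so that recognition does not hinge on exact and exhaustive manual specification.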