On efficient use of multi-view data for activity recognition

  • Authors:
  • Tommi Määttä; Aki Härmä; Hamid Aghajan

  • Affiliations:
  • Eindhoven University of Technology, Eindhoven, The Netherlands; Philips Research, Eindhoven, The Netherlands; Stanford University

  • Venue:
  • Proceedings of the Fourth ACM/IEEE International Conference on Distributed Smart Cameras
  • Year:
  • 2010


Abstract

This paper studies five methods for combining multi-view data from an uncalibrated smart camera network for human activity recognition. The multi-view classification scenarios fall into two categories: view selection and view fusion. Selection classifies using a single view, whereas fusion merges multi-view data at either the feature or the label level. The five methods are compared on the task of classifying human activities in three fully annotated datasets, MAS, VIHASI and HOMELAB, and a combination dataset, MAS+VIHASI. Classification is performed on image features computed from silhouette images with a binary tree structured classifier, using a 1D CRF for temporal modeling. The results show that fusion methods outperform practical selection methods. Selection methods have their advantages, but they depend strongly on how good the selection criterion is and how well that criterion adapts to different environments. Furthermore, feature fusion outperforms the other scenarios in more controlled settings, but the more variability there is in camera placement and in the characteristics of the persons, the more likely it is that improved multi-view recognition accuracy can be achieved by combining candidate labels.
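To make the distinction between the three combination strategies concrete, the sketch below illustrates view selection, feature-level fusion, and label-level fusion for per-view feature vectors. This is a minimal illustration of the general idea, not the paper's implementation; the function names and the example scoring criterion are assumptions for exposition.

```python
import numpy as np

def select_view(view_features, score):
    """View selection: pick the single view that maximizes a selection
    criterion (e.g. silhouette area), then classify only that view.
    `score` is a hypothetical per-view quality function."""
    best = max(range(len(view_features)), key=lambda i: score(view_features[i]))
    return view_features[best]

def fuse_features(view_features):
    """Feature-level fusion: concatenate the per-view feature vectors
    into one joint vector before classification."""
    return np.concatenate(view_features)

def fuse_labels(view_labels):
    """Label-level fusion: combine the per-view candidate labels,
    here by a simple majority vote."""
    values, counts = np.unique(np.asarray(view_labels), return_counts=True)
    return values[np.argmax(counts)]
```

Selection keeps the classifier input small but stands or falls with the quality criterion; feature fusion gives the classifier all views at once; label fusion lets each view classify independently and reconciles the candidate labels afterwards, which is more robust to heterogeneous camera placements.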