Selection of negative samples and two-stage combination of multiple features for action detection in thousands of videos

  • Authors:
  • G. J. Burghouts;K. Schutte;H. Bouma;R. J. Hollander

  • Affiliations:
  • Intelligent Imaging, TNO, The Hague, The Netherlands;Intelligent Imaging, TNO, The Hague, The Netherlands;Intelligent Imaging, TNO, The Hague, The Netherlands;Intelligent Imaging, TNO, The Hague, The Netherlands

  • Venue:
  • Machine Vision and Applications
  • Year:
  • 2014

Abstract

In this paper, a system is presented that can detect 48 human actions in realistic videos, ranging from simple actions such as 'walk' to complex actions such as 'exchange'. We propose a method that yields a major improvement in performance. This improvement stems from a different approach to three themes: sample selection, two-stage classification, and the combination of multiple features. First, we show that sampling can be improved by smart selection of the negatives. Second, we show that exploiting the posteriors of all 48 actions through two-stage classification greatly improves detection. Third, we show how low-level motion features and high-level object features should be combined. Together, these three contributions improve performance by a factor of 2.37 for human action detection on the visint.org test set of 1,294 realistic videos. In addition, we demonstrate that selective sampling and the two-stage setup improve on standard bag-of-features methods on the UT-Interaction dataset, and that our method outperforms the state of the art on the IXMAS dataset.
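
The two-stage idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each stage is a small logistic-regression detector per action; the abstract does not say which classifiers the authors use. The names `train_logistic`, `train_two_stage`, and `detect` are hypothetical, and the toy data uses 3 actions instead of 48. The first stage turns a feature vector into one posterior per action; the second stage takes the full vector of posteriors as its input, so each action's detector can exploit evidence from all other actions.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, epochs=200, lr=0.5):
    """Batch gradient descent for logistic regression (assumed classifier,
    not taken from the paper). Returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, x):
    # Posterior-like score in [0, 1] for one action.
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_two_stage(X, Y):
    """X: feature vectors. Y: one list of 0/1 labels per sample, one entry per action."""
    n_actions = len(Y[0])
    # Stage 1: one detector per action on the raw features.
    stage1 = [train_logistic(X, [row[a] for row in Y]) for a in range(n_actions)]
    # Stage-1 posteriors of ALL actions become the stage-2 feature vector.
    # Simplification: a real system would compute these on held-out data.
    P = [[predict(m, x) for m in stage1] for x in X]
    # Stage 2: one detector per action on the posterior vector.
    stage2 = [train_logistic(P, [row[a] for row in Y]) for a in range(n_actions)]
    return stage1, stage2

def detect(models, x):
    stage1, stage2 = models
    posteriors = [predict(m, x) for m in stage1]
    return [predict(m, posteriors) for m in stage2]

# Toy usage: 2-D features and 3 synthetic "actions" (illustrative data only).
random.seed(0)
X = [[random.random(), random.random()] for _ in range(40)]
Y = [[int(x[0] > 0.5), int(x[1] > 0.5), int(x[0] + x[1] > 1.0)] for x in X]
models = train_two_stage(X, Y)
scores = detect(models, [0.9, 0.8])
print(scores)  # one second-stage score per action
```

The point of the design is that the second stage sees correlations between actions (for example, a high score for one action making another more or less likely), which a set of independent per-action detectors cannot use.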