Video scene analysis of interactions between humans and vehicles using event context

  • Authors: M. S. Ryoo; Jong Taek Lee; J. K. Aggarwal
  • Affiliations: ETRI, Daejeon, Korea; University of Texas at Austin, Austin, TX; University of Texas at Austin, Austin, TX
  • Venue: Proceedings of the ACM International Conference on Image and Video Retrieval
  • Year: 2010


Abstract

We present a methodology to estimate a detailed state of a video scene involving multiple humans and vehicles. In order to automatically annotate and retrieve videos containing activities of humans and vehicles, the system must correctly identify their trajectories and relationships even in a complex dynamic environment. Our methodology constructs various joint 3-D models describing possible configurations of humans and vehicles in each image frame and performs maximum-a-posteriori tracking to obtain a sequence of scene states that matches the video. Reliable and view-independent scene state analysis is performed by taking advantage of event context. We focus on the fact that events occurring in a video must contextually coincide with scene states of humans and vehicles. Our experimental results verify that our system using event context is able to analyze and track 3-D scene states of complex human-vehicle interactions more reliably and accurately than previous systems.
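The maximum-a-posteriori tracking described above can be sketched as a Viterbi-style dynamic program over candidate scene states: at each frame the tracker scores every candidate configuration by its observation likelihood (which, in the paper, is where event context reweights states that contextually coincide with detected events) plus the best-scoring transition from the previous frame, then backtracks to recover the MAP state sequence. This is a minimal illustrative sketch, not the authors' implementation; the state labels, score tables, and function name below are hypothetical.

```python
def map_state_sequence(states, log_prior, log_trans, log_like):
    """Recover the MAP sequence of scene states via Viterbi-style DP.

    states    -- iterable of candidate scene-state labels (hypothetical)
    log_prior -- dict: state -> log prior score at frame 0
    log_trans -- dict: (prev_state, state) -> log transition score
    log_like  -- list over frames of dict: state -> log observation score
                 (event-context terms would be folded into these scores)
    """
    states = list(states)
    # Forward pass: best score for each state at each frame, plus backpointers.
    best = [{s: log_prior[s] + log_like[0][s] for s in states}]
    back = []
    for t in range(1, len(log_like)):
        cur, bp = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] + log_trans[(p, s)]) for p in states),
                key=lambda x: x[1],
            )
            cur[s] = score + log_like[t][s]
            bp[s] = prev
        best.append(cur)
        back.append(bp)
    # Backtrack from the best final state to recover the MAP path.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for bp in reversed(back):
        path.append(bp[path[-1]])
    return list(reversed(path))


# Toy usage with two hypothetical scene states: a person approaching a
# vehicle ("approach") versus entering it ("board").
states = ["approach", "board"]
log_prior = {"approach": 0.0, "board": -5.0}
log_trans = {
    ("approach", "approach"): 0.0, ("approach", "board"): -1.0,
    ("board", "board"): 0.0, ("board", "approach"): -5.0,
}
log_like = [
    {"approach": 0.0, "board": -10.0},   # frame 0 favors "approach"
    {"approach": -10.0, "board": 0.0},   # frames 1-2 favor "board"
    {"approach": -10.0, "board": 0.0},
]
print(map_state_sequence(states, log_prior, log_trans, log_like))
# → ['approach', 'board', 'board']
```

The per-frame scores here stand in for the joint 3-D configuration likelihoods of the paper; in the actual system, each state encodes the full human-vehicle configuration and the state space is constructed per frame rather than fixed.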