Learning, detection and representation of multi-agent events in videos

  • Authors:
  • Asaad Hakeem; Mubarak Shah

  • Affiliation:
  • Computer Vision Lab, School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA

  • Venue:
  • Artificial Intelligence
  • Year:
  • 2007

Abstract

In this paper, we model multi-agent events as a temporally varying sequence of sub-events, and propose a novel approach for learning, detecting, and representing events in videos. The proposed approach has three main steps. First, to learn the event structure from training videos, we automatically encode the sub-event dependency graph, which is the learnt event model depicting the conditional dependencies between sub-events. Second, we pose event detection in novel videos as clustering the maximally correlated sub-events using normalized cuts. The principal assumption made in this work is that events are composed of highly correlated chains of sub-events that have high weights (association) within a cluster and relatively low weights (disassociation) between clusters. The event detection requires no prior knowledge of the number of agents involved in an event and makes no assumptions about the length of an event. Third, we recognize that any abstract event model should extend to representations related to human understanding of events. Therefore, we propose an extension of the CASE representation of natural languages that provides a plausible means of interfacing between users and the computer. We show results of learning, detection, and representation of events for videos in the meeting, surveillance, and railroad monitoring domains.
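The second step, detecting events by grouping maximally correlated sub-events with normalized cuts, can be illustrated with a minimal sketch. The sketch below is not the authors' implementation; it assumes a hypothetical pairwise affinity matrix `W` over sub-events (high weights within an event, low weights across events) and applies the standard Shi-Malik normalized-cut relaxation: threshold the second eigenvector of the normalized graph Laplacian.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Split items into two clusters via the normalized-cut relaxation:
    threshold the second-smallest eigenvector of the symmetric normalized
    Laplacian L_sym = I - D^{-1/2} W D^{-1/2} (Shi & Malik)."""
    d = W.sum(axis=1)                      # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # ascending eigenvalues
    # Map back to the generalized eigenvector (the "Fiedler vector")
    fiedler = D_inv_sqrt @ eigvecs[:, 1]
    return fiedler > 0                     # boolean cluster labels

# Toy affinity over five sub-events: 0-2 form one tightly associated
# event, 3-4 another, with weak cross-event links (values are invented
# for illustration only).
W = np.array([
    [0.0, 0.9, 0.8, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.8, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 0.0],
])
labels = normalized_cut_bipartition(W)
# Sub-events 0-2 land in one cluster and 3-4 in the other.
```

In practice the affinities would be derived from the learnt sub-event dependencies, and the bipartition would be applied recursively since the number of events (and agents) is not known in advance.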