Video scene analysis of interactions between humans and vehicles using event context

  • Authors: M. S. Ryoo; Jong Taek Lee; J. K. Aggarwal
  • Affiliations: ETRI, Daejeon, Korea; University of Texas at Austin, Austin, TX; University of Texas at Austin, Austin, TX
  • Venue: Proceedings of the ACM International Conference on Image and Video Retrieval
  • Year: 2010


Abstract

We present a methodology to estimate a detailed state of a video scene involving multiple humans and vehicles. In order to automatically annotate and retrieve videos containing activities of humans and vehicles, the system must correctly identify their trajectories and relationships even in a complex dynamic environment. Our methodology constructs various joint 3-D models describing possible configurations of humans and vehicles in each image frame and performs maximum-a-posteriori tracking to obtain a sequence of scene states that matches the video. Reliable and view-independent scene state analysis is performed by taking advantage of event context. We focus on the fact that events occurring in a video must contextually coincide with scene states of humans and vehicles. Our experimental results verify that our system using event context is able to analyze and track 3-D scene states of complex human-vehicle interactions more reliably and accurately than previous systems.
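The maximum-a-posteriori tracking described above can be sketched as a Viterbi-style dynamic program over candidate scene states: at each frame the tracker scores every candidate configuration by its observation likelihood (which, in the paper, is where event context reweights states that contextually coincide with detected events) plus the best-scoring transition from the previous frame, then backtracks to recover the MAP state sequence. This is a minimal illustrative sketch, not the authors' implementation; the state labels, score tables, and function name below are hypothetical.

```python
def map_state_sequence(states, log_prior, log_trans, log_like):
    """Recover the MAP sequence of scene states via Viterbi-style DP.

    states    -- iterable of candidate scene-state labels (hypothetical)
    log_prior -- dict: state -> log prior score at frame 0
    log_trans -- dict: (prev_state, state) -> log transition score
    log_like  -- list over frames of dict: state -> log observation score
                 (event-context terms would be folded into these scores)
    """
    states = list(states)
    # Forward pass: best score for each state at each frame, plus backpointers.
    best = [{s: log_prior[s] + log_like[0][s] for s in states}]
    back = []
    for t in range(1, len(log_like)):
        cur, bp = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] + log_trans[(p, s)]) for p in states),
                key=lambda x: x[1],
            )
            cur[s] = score + log_like[t][s]
            bp[s] = prev
        best.append(cur)
        back.append(bp)
    # Backtrack from the best final state to recover the MAP path.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for bp in reversed(back):
        path.append(bp[path[-1]])
    return list(reversed(path))


# Toy usage with two hypothetical scene states: a person approaching a
# vehicle ("approach") versus entering it ("board").
states = ["approach", "board"]
log_prior = {"approach": 0.0, "board": -5.0}
log_trans = {
    ("approach", "approach"): 0.0, ("approach", "board"): -1.0,
    ("board", "board"): 0.0, ("board", "approach"): -5.0,
}
log_like = [
    {"approach": 0.0, "board": -10.0},   # frame 0 favors "approach"
    {"approach": -10.0, "board": 0.0},   # frames 1-2 favor "board"
    {"approach": -10.0, "board": 0.0},
]
print(map_state_sequence(states, log_prior, log_trans, log_like))
# → ['approach', 'board', 'board']
```

The per-frame scores here stand in for the joint 3-D configuration likelihoods of the paper; in the actual system, each state encodes the full human-vehicle configuration and the state space is constructed per frame rather than fixed.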