This paper investigates a real-time system for detecting context-independent events in video shots. We test the system in video surveillance environments with a fixed camera. We assume that objects have been segmented (not necessarily perfectly) and reason with their low-level features, such as shape, and mid-level features, such as trajectory, to infer events related to moving objects. Our goal is to detect generic events, i.e., events that are independent of where or how they occur. Events are detected on the basis of formal event definitions and approximate but efficient world models, by continually monitoring changes in, and the behavior of, the features of video objects; an event is detected when its defining conditions are met. We classify events into four types: primitive, action, interaction, and composite. Our system comprises three interacting video processing layers: enhancement, to estimate and reduce additive noise; analysis, to segment and track video objects; and interpretation, to detect context-independent events. The contributions of this paper are the interpretation of spatio-temporal object features to detect context-independent events in real time, the adaptation to noise, and the correction and compensation of low-level processing errors at higher layers, where more information is available. The effectiveness and real-time response of our system are demonstrated by extensive experimentation on indoor and outdoor video shots in the presence of multi-object occlusion, different noise levels, and coding artifacts.
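To make the idea of condition-based detection concrete, the following is a minimal sketch of how features of tracked objects might be monitored from one frame to the next. It is not the system described above: the `ObjectState` record, the `detect_events` function, the event names ("appear", "disappear", "move", "still"), and the `move_thresh` parameter are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ObjectState:
    """Per-frame features of one tracked object (hypothetical layout)."""
    centroid: Tuple[float, float]  # successive centroids form the trajectory


def detect_events(prev: Dict[int, ObjectState],
                  curr: Dict[int, ObjectState],
                  move_thresh: float = 2.0) -> List[Tuple[str, int]]:
    """Compare two consecutive frames and emit (event_name, object_id) pairs.

    The event names and the threshold are illustrative assumptions,
    not definitions taken from the paper.
    """
    events: List[Tuple[str, int]] = []

    # Object ids present only in the current frame.
    for oid in sorted(curr.keys() - prev.keys()):
        events.append(("appear", oid))

    # Object ids present only in the previous frame.
    for oid in sorted(prev.keys() - curr.keys()):
        events.append(("disappear", oid))

    # Objects tracked in both frames: test centroid displacement.
    for oid in sorted(curr.keys() & prev.keys()):
        (x0, y0), (x1, y1) = prev[oid].centroid, curr[oid].centroid
        displacement = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        events.append(("move" if displacement > move_thresh else "still", oid))

    return events


# Usage: object 1 shifts by 5 pixels, object 2 is new in the second frame.
frame_a = {1: ObjectState((10.0, 10.0))}
frame_b = {1: ObjectState((15.0, 10.0)), 2: ObjectState((40.0, 40.0))}
events = detect_events(frame_a, frame_b)
```

In this sketch each rule depends only on object features, never on the scene, which is the sense in which such events could be called context-independent. Higher-level event types would presumably be built by combining such outputs over time, but how the paper does so is not stated in the abstract.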