Understanding video events: a survey of methods for automatic interpretation of semantic occurrences in video

Authors:
Gal Lavee; Ehud Rivlin; Michael Rudzsky
Affiliations:
Department of Computer Science, Technion-Israel Institute of Technology, Haifa, Israel (all authors)
Venue:
IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
Year:
2009

Abstract

Understanding video events, i.e., the translation of low-level content in video sequences into high-level semantic concepts, is a research topic that has received much interest in recent years. Important applications of this work include smart surveillance systems, semantic video database indexing, and interactive systems. This technology can be applied to several video domains, including airport terminals, parking lots, traffic, subway stations, aerial surveillance, and sign language data. In this paper, we identify the two main components of the event understanding process: abstraction and event modeling. Abstraction is the process of molding the data into informative units to be used as input to the event model. Due to space restrictions, we limit the discussion of abstraction here; see the study by Lavee et al. (Understanding video events: A survey of methods for automatic interpretation of semantic occurrences in video, Technion-Israel Inst. Technol., Haifa, Israel, Tech. Rep. CIS-2009-06, 2009) for a more complete discussion. Event modeling is devoted to describing events of interest formally and enabling recognition of these events as they occur in the video sequence. Event modeling can be further decomposed into the categories of pattern-recognition methods, state event models, and semantic event models. In this survey, we discuss this proposed taxonomy of the literature, offer a unifying terminology, and discuss popular event modeling formalisms (e.g., the hidden Markov model) and their use in video event understanding, using extensive examples from the literature. Finally, we consider the application domain of video event understanding in light of the proposed taxonomy and propose future directions for research in this field.