Multiobject Behavior Recognition by Event Driven Selective Attention Method

Authors:
Toshikazu Wada; Takashi Matsuyama
Affiliations:
Kyoto Univ., Kyoto, Japan; Kyoto Univ., Kyoto, Japan
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2000

Abstract

Recognizing the behaviors of multiple objects from nonsegmented image sequences is difficult because most motion recognition methods proposed so far assume a single object. With existing methods, the problem can be solved only by bottom-up segmentation of the image sequence followed by sequence classification. This straightforward approach depends entirely on bottom-up segmentation, which is easily affected by occlusions and outliers. This paper presents a novel approach to the task that does not use bottom-up segmentation. The approach is based on assumption generation and verification: feasible assumptions about the present behaviors, consistent with the input image and the behavior models, are dynamically generated and then verified by finding supporting evidence in the input images. This is realized by an architecture called the selective attention model, which consists of a state-dependent event detector and an event sequence analyzer. The former detects an image variation (an event) within a limited image region (the focusing region), and is therefore not affected by occlusions and outliers. The latter analyzes sequences of detected events and activates all feasible states, which represent assumptions about multiobject behaviors. In this architecture, event detection can be regarded as verification of the generated assumptions, because each focusing region is determined by the corresponding assumption. The architecture is sound because all feasible assumptions are generated. However, these redundant assumptions make the recognition result ambiguous. Hence, we extend the system with 1) colored-token propagation, to discriminate different objects in the state space, and 2) integration of multiviewpoint image sequences, to disambiguate the single-view recognition results. Extensive experiments on human behavior recognition in real-world environments demonstrate the soundness and robustness of the architecture.
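
The interplay between the state-dependent event detector and the event sequence analyzer can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Transition`, `event_in_region`, and `step`, the frame-differencing test, the threshold, and the rule that a token waits when no event is seen are all assumptions made for illustration. Frames are plain nested lists of gray levels, and each token is a `(state, color)` pair, where the color stands for one object.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical type: (row0, col0, row1, col1), half-open on the upper bounds.
Region = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Transition:
    source: str     # state (assumption) that must be active
    target: str     # state activated when the event is observed
    region: Region  # focusing region inspected while `source` is active


def event_in_region(prev, curr, region, threshold=10):
    """Assumed event test: some pixel inside `region` changed by more than `threshold`."""
    r0, c0, r1, c1 = region
    return any(
        abs(curr[r][c] - prev[r][c]) > threshold
        for r in range(r0, r1)
        for c in range(c0, c1)
    )


def step(tokens, transitions, prev, curr, threshold=10):
    """Advance every colored token by one frame.

    Only the focusing regions of transitions leaving an active state are
    inspected, so changes elsewhere in the image are ignored. A token with
    no detected event stays where it is (an assumption of this sketch).
    """
    next_tokens = set()
    for state, color in tokens:
        fired = [
            t for t in transitions
            if t.source == state
            and event_in_region(prev, curr, t.region, threshold)
        ]
        if fired:
            # Activate every feasible successor; the color follows the object.
            next_tokens.update((t.target, color) for t in fired)
        else:
            next_tokens.add((state, color))
    return next_tokens


# Toy behavior model: outside -> at_door -> inside (hypothetical states).
transitions = [
    Transition("outside", "at_door", (0, 0, 2, 2)),
    Transition("at_door", "inside", (2, 2, 4, 4)),
]

blank = [[0] * 4 for _ in range(4)]
door = [row[:] for row in blank]
door[1][1] = 255   # change inside the focusing region of "outside"
noise = [row[:] for row in blank]
noise[0][3] = 255  # change outside every focusing region of "at_door"

tokens = {("outside", "red"), ("at_door", "blue")}
tokens = step(tokens, transitions, blank, door)
print(sorted(tokens))  # [('at_door', 'blue'), ('at_door', 'red')]
tokens = step(tokens, transitions, door, noise)
print(sorted(tokens))  # [('at_door', 'blue'), ('at_door', 'red')]
```

In the first step only the red token moves, because the change falls in the region tied to its state. In the second step neither token moves, because the changed pixels lie outside the one region that the active state makes relevant. The two tokens share a state but stay distinguishable by color, which is the role the abstract assigns to colored-token propagation.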