We address the problem of visually detecting causal events and fitting them together into a coherent story of the action witnessed by the camera. We show that this can be done by reasoning about the motions and collisions of surfaces, using high-level causal constraints derived from psychological studies of infant visual behavior. These constraints are naive forms of basic physical laws governing substantiality, contiguity, momentum, and acceleration. We describe two implementations. One system parses instructional videos, extracting plans of action and key frames suitable for storyboarding. Since learning will play a role in making such systems robust, we introduce a new framework for higher-order hidden Markov models and demonstrate its use in a second system that segments stereo video into actions in near real-time. Rather than attempt accurate low-level vision, both systems use high-level causal analysis to integrate fast but sloppy pixel-based representations over time. The output is suitable for summary, indexing, and automated editing.
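To make the idea of segmenting a frame sequence into actions concrete, here is a minimal sketch using standard Viterbi decoding over an ordinary first-order hidden Markov model. It is not the higher-order framework the abstract introduces, whose details are not given here. The state names ("idle", "move"), the observation symbols ("still", "motion"), and all probabilities are hypothetical values chosen only for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence under a first-order HMM."""

    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def segments(path):
    """Collapse per-frame labels into (label, first_frame, last_frame) runs."""
    runs = []
    for t, label in enumerate(path):
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1], t)
        else:
            runs.append((label, t, t))
    return runs


# Hypothetical two-action model over coarse per-frame motion cues.
STATES = ["idle", "move"]
START_P = {"idle": 0.5, "move": 0.5}
TRANS_P = {"idle": {"idle": 0.9, "move": 0.1}, "move": {"idle": 0.1, "move": 0.9}}
EMIT_P = {
    "idle": {"still": 0.9, "motion": 0.1},
    "move": {"still": 0.2, "motion": 0.8},
}

frames = ["still"] * 3 + ["motion"] * 3 + ["still"] * 3
labels = viterbi(frames, STATES, START_P, TRANS_P, EMIT_P)
print(segments(labels))
```

The sticky self-transitions (0.9) are what let noisy per-frame cues be integrated over time: a single mislabeled frame is usually cheaper to explain as a bad observation than as a switch of action.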