Robust sequence alignment for actor-object interaction recognition: Discovering actor-object states

Authors:
Roman Filipovych;Eraldo Ribeiro
Affiliations:
Computer Vision and Bio-inspired Computing Laboratory, Florida Institute of Technology, Department of Computer Sciences, Melbourne, FL, USA;Computer Vision and Bio-inspired Computing Laboratory, Florida Institute of Technology, Department of Computer Sciences, Melbourne, FL, USA
Venue:
Computer Vision and Image Understanding
Year:
2011

Abstract

In this paper, we address the problem of recognizing atomic human-object interactions from videos. Our method is based on the observation that, at the moment of physical contact with the object, both the motion and appearance (i.e., shape) of the interacting person are constrained by the target object. We introduce the concept of actor-object states as the instantaneous configuration of actor and object that usually corresponds to the moment of physical contact. We argue that the information content in frames belonging to the actor-object states is descriptive of the specific interaction. We use the actor-object state concept to propose an approach in which human-object interactions are represented by a combination of image patches and velocity information extracted along tracked body-point trajectories. However, determining the set of video frames corresponding to actor-object states is challenging as, before and after physical contact, human motion and appearance may vary significantly for the same interaction type. We address this issue by means of a robust sequence-matching algorithm that discovers actor-object states by matching pairs of misaligned sequences of features. We then show how these discovered actor-object states can be used for the recognition of basic interactions with objects. Finally, we evaluate the proposed concept on classification tasks performed on a new dataset of atomic human-object interactions.
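To make the idea of matching pairs of misaligned feature sequences concrete, here is a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It swaps in a generic Smith-Waterman-style local alignment, treats each frame as a small feature vector, and returns the best-matching run of frame pairs as candidate actor-object state frames. The function name `local_align`, the parameters `match_threshold` and `gap_penalty`, and the toy feature values are all assumptions made for illustration.

```python
import math


def local_align(seq_a, seq_b, match_threshold=1.0, gap_penalty=0.5):
    """Smith-Waterman-style local alignment of two per-frame feature sequences.

    Illustrative stand-in only; parameter names and defaults are hypothetical.
    Returns (best_score, pairs), where pairs lists the aligned 0-based
    frame indices (i, j) of the highest-scoring local match.
    """
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best, best_cell = 0.0, None

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Similarity is positive only when the two frames are close.
            sim = match_threshold - math.dist(seq_a[i - 1], seq_b[j - 1])
            candidates = [
                (0.0, None),                                  # restart the match
                (score[i - 1][j - 1] + sim, (i - 1, j - 1)),  # align both frames
                (score[i - 1][j] - gap_penalty, (i - 1, j)),  # skip a frame of seq_a
                (score[i][j - 1] - gap_penalty, (i, j - 1)),  # skip a frame of seq_b
            ]
            score[i][j], back[i][j] = max(candidates, key=lambda c: c[0])
            if score[i][j] > best:
                best, best_cell = score[i][j], (i, j)

    # Trace back from the best cell until the local match score drops to zero.
    pairs = []
    cell = best_cell
    while cell is not None and score[cell[0]][cell[1]] > 0:
        i, j = cell
        prev = back[i][j]
        if prev == (i - 1, j - 1):
            pairs.append((i - 1, j - 1))
        cell = prev
    pairs.reverse()
    return best, pairs


# Toy 2-D features: the two sequences agree only around a short "contact" run,
# which appears at different time offsets in each sequence.
seq_a = [(5.0, 5.0), (9.0, 1.0), (0.0, 0.0), (0.1, 0.0), (7.0, 3.0)]
seq_b = [(3.0, 8.0), (0.0, 0.0), (0.1, 0.0), (6.0, 6.0), (2.0, 9.0)]
best_score, state_pairs = local_align(seq_a, seq_b)
print(best_score, state_pairs)
```

Local alignment is used in this sketch because it scores only the stretch where the two sequences agree and ignores the frames before and after it, which mirrors the abstract's point that motion and appearance vary widely away from the moment of contact.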