Learning human interaction by interactive phrases

Authors:
Yu Kong;Yunde Jia;Yun Fu
Affiliations:
Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, P.R. China, Department of CSE, State University of New York, Buffalo ...;Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, P.R. China;Department of ECE and College of CIS, Northeastern University, Boston, MA
Venue:
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part I
Year:
2012

Citing 7
Cited 0

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers

ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I
Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition

IEEE Transactions on Pattern Analysis and Machine Intelligence
A discriminative latent model of object classes and attributes

ECCV'10 Proceedings of the 11th European conference on Computer vision: Part V
Stochastic Representation and Recognition of High-Level Group Activities

International Journal of Computer Vision
Approximating discrete probability distributions with dependence trees

IEEE Transactions on Information Theory
Human activity prediction: Early recognition of ongoing activities from streaming videos

ICCV '11 Proceedings of the 2011 International Conference on Computer Vision

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, we present a novel approach for human interaction recognition from videos. We introduce high-level descriptions called interactive phrases to express binary semantic motion relationships between interacting people. Interactive phrases naturally exploit human knowledge to describe interactions and allow us to construct a more descriptive model for recognizing human interactions. We propose a novel hierarchical model to encode interactive phrases based on the latent SVM framework where interactive phrases are treated as latent variables. The interdependencies between interactive phrases are explicitly captured in the model to deal with motion ambiguity and partial occlusion in interactions. We evaluate our method on a newly collected BIT-Interaction dataset and UT-Interaction dataset. Promising results demonstrate the effectiveness of the proposed method.