Content-based retrieval of functional objects in video using scene context

Authors:
Sangmin Oh;Anthony Hoogs;Matthew Turek;Roderic Collins
Affiliations:
Kitware Inc.;Kitware Inc.;Kitware Inc.;Kitware Inc.
Venue:
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Year:
2010

Abstract

Functional object recognition in video is an emerging problem in visual surveillance and video understanding. By functional objects, we mean objects with a specific purpose, such as a postman or a delivery truck, which are defined more by their actions and behaviors than by their appearance. In this work, we present an approach for content-based learning and recognition of the function of moving objects given video-derived tracks. In particular, we show that the semantic behaviors of movers can be captured in a location-independent manner by attributing them with features that encode their relations and actions with respect to scene contexts. By scene context, we mean local scene regions with different functionalities, such as doorways and parking spots, with which moving objects often interact. Based on these representations, functional models are learned from examples, and novel instances are then identified in unseen data. Furthermore, recognition in the presence of track fragmentation, caused by imperfect tracking, is addressed by a boosting-based track-linking classifier. Our experimental results highlight both promising and practical aspects of our approach.
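
To make the idea of location-independent features more concrete, here is a minimal Python sketch. It is an illustration only and not the authors' feature set: the track format, the function names `nearest_distance` and `context_features`, the three features per context type, and the `radius` threshold are all assumptions introduced for this example. The point it tries to show is that a track described only by its relation to nearby scene-context regions (such as doorways or parking spots) keeps the same description when the whole scene is shifted.

```python
import math

def nearest_distance(point, centers):
    """Distance from a point to the closest center; inf if there are none."""
    return min((math.dist(point, c) for c in centers), default=math.inf)

def context_features(track, contexts, radius=5.0):
    """Describe a track by its relation to each scene-context type.

    All names and features here are hypothetical, chosen for illustration.
    track:    list of (x, y) positions in time order
    contexts: dict mapping a context type (e.g. "doorway") to (x, y) centers
    radius:   assumed distance under which a track point counts as "near"
    """
    features = {}
    for ctx_type in sorted(contexts):
        centers = contexts[ctx_type]
        # Distance from every track point to the nearest region of this type.
        dists = [nearest_distance(p, centers) for p in track]
        features[ctx_type + "_start_dist"] = dists[0]
        features[ctx_type + "_end_dist"] = dists[-1]
        # Fraction of the track spent near this type of region.
        features[ctx_type + "_near_frac"] = sum(d <= radius for d in dists) / len(dists)
    return features

# A mover that starts at a doorway and ends at a parking spot.
track = [(0, 0), (3, 4), (10, 0)]
contexts = {"doorway": [(0, 0)], "parking_spot": [(10, 0)]}
features = context_features(track, contexts)

# The same scene translated elsewhere in the image.
shifted_track = [(x + 100, y + 50) for x, y in track]
shifted_contexts = {k: [(x + 100, y + 50) for x, y in v] for k, v in contexts.items()}
shifted_features = context_features(shifted_track, shifted_contexts)
```

Because every feature is a distance or a proportion measured relative to scene-context regions, `features` and `shifted_features` are identical in this example. A feature vector of this kind could then be passed to any supervised classifier. The boosting-based track-linking step mentioned in the abstract is not sketched here.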