Track to the future: Spatio-temporal video segmentation with long-range motion cues

Authors:
J. Lezama;K. Alahari;J. Sivic;I. Laptev
Venue:
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Year:
2011

Citing: 0
Cited by: 10

Multi-scale clustering of frame-to-frame correspondences for motion segmentation
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II

Two-granularity tracking: mediating trajectory and detection graphs for tracking under occlusions
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part V

Streaming hierarchical video segmentation
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part VI

SuperFloxels: a mid-level representation for video sequences
ECCV'12 Proceedings of the 12th international conference on Computer Vision - Volume Part III

The co-attention model for tiny activity analysis
Neurocomputing

Video segmentation with superpixels
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part I

Robust object tracking using constellation model with superpixel
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part III

Alpha-Flow for video matting
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part III

User-assisted sparse stereo-video segmentation
Proceedings of the 10th European Conference on Visual Media Production

Activity representation with motion hierarchies
International Journal of Computer Vision

Abstract

Video provides not only rich visual cues such as motion and appearance, but also much less explored long-range temporal interactions among objects. We aim to capture such interactions and to construct a powerful intermediate-level video representation for subsequent recognition. Motivated by this goal, we seek to obtain a spatio-temporal oversegmentation of a video into regions that respect object boundaries and, at the same time, associate object pixels over many video frames. The contributions of this paper are twofold. First, we develop an efficient spatio-temporal video segmentation algorithm, which naturally incorporates long-range motion cues from past and future frames in the form of clusters of point tracks with coherent motion. Second, we devise a new track clustering cost function that includes occlusion reasoning, in the form of depth ordering constraints, as well as motion similarity along the tracks. We evaluate the proposed approach on a challenging set of video sequences of office scenes from feature-length movies.
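
To make the idea of "clusters of point tracks with coherent motion" concrete, here is a minimal sketch of grouping tracks by motion similarity alone. It is not the paper's cost function: the abstract does not give its form, and the occlusion reasoning and depth ordering constraints are left out entirely. The track representation (a dict from frame index to position), the names `motion_distance` and `cluster_tracks`, and the `threshold` parameter are all assumptions made for illustration.

```python
from itertools import combinations


def motion_distance(track_a, track_b):
    """Largest difference in frame-to-frame displacement over shared frames.

    Each track is a dict {frame_index: (x, y)} (a hypothetical representation).
    Returns None when the tracks share no pair of consecutive frames.
    """
    shared = sorted(set(track_a) & set(track_b))
    worst = None
    for f0, f1 in zip(shared, shared[1:]):
        if f1 != f0 + 1:
            continue  # only compare displacements between adjacent frames
        va = (track_a[f1][0] - track_a[f0][0], track_a[f1][1] - track_a[f0][1])
        vb = (track_b[f1][0] - track_b[f0][0], track_b[f1][1] - track_b[f0][1])
        d = ((va[0] - vb[0]) ** 2 + (va[1] - vb[1]) ** 2) ** 0.5
        worst = d if worst is None else max(worst, d)
    return worst


def cluster_tracks(tracks, threshold=1.0):
    """Link tracks whose motion distance is below `threshold` (union-find).

    This simple thresholding stands in for a real clustering cost and
    ignores occlusion and depth ordering. Returns one label per track.
    """
    parent = list(range(len(tracks)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(tracks)), 2):
        d = motion_distance(tracks[i], tracks[j])
        if d is not None and d < threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(len(tracks))]


# Two points moving right at 2 px/frame and one static point.
moving_a = {0: (0, 0), 1: (2, 0), 2: (4, 0)}
moving_b = {0: (10, 5), 1: (12, 5), 2: (14, 5)}
static_c = {0: (50, 50), 1: (50, 50), 2: (50, 50)}
labels = cluster_tracks([moving_a, moving_b, static_c])
```

In this toy input the two moving tracks end up with the same label and the static track with a different one. Taking the maximum displacement difference over shared frames means a single frame of disagreement is enough to separate two tracks, which is one simple way to exploit long-range motion rather than only adjacent-frame flow.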