We propose a tracking framework that mediates grouping cues from two levels of tracking granularity, detection tracklets and point trajectories, to segment objects in crowded scenes. Detection tracklets capture objects when they are mostly visible. They may be sparse in time, may miss partially occluded or deformed objects, and may contain false positives. Point trajectories are dense in space and time. Their affinities integrate long-range motion and 3D disparity information, which is useful for segmentation. However, because these affinities lack model knowledge, they may leak across similarly moving objects. We establish one trajectory graph and one detection tracklet graph, encoding grouping affinities within each space and associations across the two. Two-granularity tracking is cast as simultaneous detection tracklet classification and clustering (cl2) in the joint space of tracklets and trajectories. We solve cl2 by explicitly mediating contradictory affinities in the two graphs: detection tracklet classification modifies trajectory affinities to reflect object-specific dis-associations, and non-accidental grouping alignment between detection tracklets and trajectory clusters boosts or rejects the corresponding detection tracklets, changing their classification accordingly.

We show that our model can track objects through sparse, inaccurate detections and persistent partial occlusions. In contrast to detection-based bounding box trackers, it adapts to the changing visibility masks of the targets by effectively switching between the two granularities according to object occlusions, deformations and background clutter.
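The two mediation steps described above (classification modifying trajectory affinities, and grouping alignment scoring tracklets) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mediate_affinities` and `alignment_score`, the boolean association matrix, the hard reset of conflicting affinities to a fixed `repulsion` value, and the majority-cluster alignment measure are all assumptions made for illustration only.

```python
import numpy as np


def mediate_affinities(traj_affinity, assoc, accepted, repulsion=0.0):
    """Suppress affinities between trajectories claimed by different tracklets.

    traj_affinity: (n, n) symmetric trajectory affinity matrix.
    assoc:         (m, n) boolean, assoc[k, i] is True when trajectory i is
                   associated with detection tracklet k (hypothetical encoding).
    accepted:      (m,) boolean, current tracklet classification.
    """
    A = np.asarray(traj_affinity, dtype=float).copy()
    assoc = np.asarray(assoc, dtype=bool)
    accepted = np.asarray(accepted, dtype=bool)
    if not accepted.any():
        return A  # no accepted tracklet, nothing to dis-associate

    active = assoc[accepted]                 # rows of accepted tracklets only
    counts = active.sum(axis=0)              # tracklets claiming each trajectory
    # A trajectory has an owner only if exactly one accepted tracklet claims it.
    owner = np.where(counts == 1, active.argmax(axis=0), -1)
    claimed = owner >= 0
    conflict = (claimed[:, None] & claimed[None, :]
                & (owner[:, None] != owner[None, :]))
    A[conflict] = repulsion                  # object-specific dis-association
    return A


def alignment_score(assoc_row, cluster_labels):
    """Fraction of a tracklet's trajectories that fall in one trajectory cluster."""
    labels = np.asarray(cluster_labels)[np.asarray(assoc_row, dtype=bool)]
    if labels.size == 0:
        return 0.0
    return float(np.bincount(labels).max() / labels.size)
```

In such a sketch, a tracklet with a high alignment score would be a candidate for boosting and one with a low score a candidate for rejection, after which the affinities could be mediated again with the updated classification. The threshold and the clustering method are left open here because the abstract does not specify them.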