Automatic face association across unconstrained video frames has many practical applications. Recent advances in object detection have made it possible to replace traditional tracking-based association approaches with more robust detection-based ones. However, the task remains very challenging for real-world unconstrained videos, especially when the subjects are on a moving platform and at distances exceeding several tens of meters. In this paper, we present a novel solution based on a Conditional Random Field (CRF) framework. The CRF approach not only provides a probabilistic and systematic treatment of the problem, but also elegantly combines global and local features. When ambiguities in labels cannot be resolved using face appearance alone, our method relies on multiple contextual features to provide further evidence for association. Our algorithm works in an online mode and reliably handles real-world videos. Experimental results on challenging video data, together with comparisons with other methods, demonstrate the effectiveness of our method.