Unsupervised metric learning for face identification in TV video

Authors:
Ramazan Gokberk Cinbis;Jakob Verbeek;Cordelia Schmid
Affiliations:
LEAR, INRIA Grenoble Laboratoire Jean Kuntzmann, France;LEAR, INRIA Grenoble Laboratoire Jean Kuntzmann, France;LEAR, INRIA Grenoble Laboratoire Jean Kuntzmann, France
Venue:
ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
Year:
2011

Abstract

The goal of face identification is to decide whether two faces depict the same person or not. This paper addresses the identification problem for face-tracks that are automatically collected from uncontrolled TV video data. Face-track identification is an important component in systems that automatically label characters in TV series or movies based on subtitles and/or scripts: it enables effective transfer of the sparse text-based supervision to other faces. We show that, without manually labeling any examples, metric learning can be effectively used to address this problem. This is possible by using pairs of faces within a track as positive examples, while negative training examples can be generated from pairs of face tracks of different people that appear together in a video frame. In this manner we can learn a cast-specific metric, adapted to the people appearing in a particular video, without using any supervision. Identification performance can be further improved using semi-supervised learning where we also include labels for some of the face tracks. We show that our cast-specific metrics not only improve identification, but also recognition and clustering.
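
The pair-generation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_training_pairs`, the track layout (track id mapped to a frame-indexed dictionary of face descriptors), and the string placeholders standing in for descriptors are all assumptions made for illustration. Positive pairs come from faces within one track; negative pairs come from two tracks that share at least one video frame.

```python
from itertools import combinations, product


def build_training_pairs(tracks):
    """Collect (positive, negative) face pairs without manual labels.

    tracks: dict mapping a track id to a dict {frame index: face descriptor}.
    This layout is a hypothetical choice for the sketch.
    """
    positives, negatives = [], []

    # Faces within one track are assumed to depict the same person.
    for faces in tracks.values():
        ordered = [faces[frame] for frame in sorted(faces)]
        positives.extend(combinations(ordered, 2))

    # Two tracks that appear together in a frame are assumed to depict
    # different people, so every cross-track pair is a negative example.
    for faces_a, faces_b in combinations(tracks.values(), 2):
        if faces_a.keys() & faces_b.keys():
            negatives.extend(product(faces_a.values(), faces_b.values()))

    return positives, negatives


# Toy data: t1 and t2 overlap in frames 1 and 2; t3 overlaps with neither.
tracks = {
    "t1": {0: "a0", 1: "a1", 2: "a2"},
    "t2": {1: "b1", 2: "b2"},
    "t3": {10: "c10", 11: "c11"},
}
positives, negatives = build_training_pairs(tracks)
```

In this toy data, track `t3` contributes positive pairs only, because it never co-occurs with another track. The resulting pairs could then be passed to any pairwise metric learning method; which learner to use is left open here.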