Fusing matching and biometric similarity measures for face diarization in video

Authors:
Elie Khoury; Paul Gay; Jean-Marc Odobez
Affiliations:
Idiap Research Institute, Martigny, Switzerland; LIUM, University of Maine, Le Mans, France; Idiap Research Institute, Martigny, Switzerland
Venue:
Proceedings of the 3rd ACM International Conference on Multimedia Retrieval
Year:
2013

Abstract

This paper addresses face diarization in videos, that is, deciding which face appears in the video and when. To achieve this face-track clustering task, we propose a hierarchical approach combining the strengths of two complementary measures: (i) a pairwise matching similarity relying on local interest points, which allows accurate clustering of face tracks captured in similar conditions, a situation typically found in temporally close shots of broadcast videos or in talk shows; and (ii) a biometric cross-likelihood ratio similarity measure relying on Gaussian Mixture Models (GMMs) that model the distribution of densely sampled local features (Discrete Cosine Transform (DCT) coefficients) and that better handles appearance variability. Experiments carried out on a public video dataset and on data from the French REPERE challenge demonstrate the effectiveness of our approach in comparison with state-of-the-art methods.
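
The abstract does not specify how the two measures are combined, so the following is only a minimal sketch of one plausible reading: face tracks are first merged using the matching similarity, and the resulting clusters are then merged using the biometric similarity. The function names, the average-linkage rule, the thresholds, and the toy similarity matrices are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def average_linkage(sim, a, b):
    """Mean pairwise similarity between two clusters of track indices."""
    return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))


def agglomerate(clusters, sim, threshold):
    """Merge the most similar pair of clusters until no pair reaches threshold."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        best_score, (p, q) = max(
            (average_linkage(sim, clusters[p], clusters[q]), (p, q))
            for p, q in combinations(range(len(clusters)), 2)
        )
        if best_score < threshold:
            break
        # p < q always holds, so deleting index q leaves index p valid.
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


def two_stage_diarization(match_sim, bio_sim, match_thr, bio_thr):
    """Hypothetical two-stage scheme: matching similarity, then biometric."""
    singletons = [[i] for i in range(len(match_sim))]
    stage1 = agglomerate(singletons, match_sim, match_thr)
    stage2 = agglomerate(stage1, bio_sim, bio_thr)
    return sorted(sorted(c) for c in stage2)


# Toy data (assumed): tracks 0 and 1 show one person in similar conditions,
# track 2 shows the same person in different conditions, track 3 is someone else.
MATCH_SIM = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
BIO_SIM = [
    [1.0, 0.8, 0.7, 0.1],
    [0.8, 1.0, 0.6, 0.2],
    [0.7, 0.6, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]

stage1_only = agglomerate([[i] for i in range(4)], MATCH_SIM, 0.5)
result = two_stage_diarization(MATCH_SIM, BIO_SIM, 0.5, 0.5)
print(stage1_only)
print(result)
```

In this toy setting the matching stage groups only the two tracks captured in similar conditions, and the biometric stage then attaches the third track of the same person. That mirrors the complementarity the abstract describes, although the actual paper may fuse the measures differently.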