Fusing matching and biometric similarity measures for face diarization in video

  • Authors:
  • Elie Khoury;Paul Gay;Jean-Marc Odobez

  • Affiliations:
  • Idiap Research Institute, Martigny, Switzerland;LIUM, University of Maine, Le Mans, France;Idiap Research Institute, Martigny, Switzerland

  • Venue:
  • Proceedings of the 3rd ACM conference on International conference on multimedia retrieval
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper addresses face diarization in videos, that is, deciding which face appears and when in the video. To achieve this face-track clustering task, we propose a hierarchical approach combining the strength of two complementary measures: (i) a pairwise matching similarity relying on local interest points allowing the accurate clustering of faces tracks captured in similar conditions, a situation typically found in temporally close shots of broadcast videos or in talk-shows; (ii) a biometric cross-likelihood ratio similarity measure relying on Gaussian Mixture Models (GMMs) modeling the distribution of densely sampled local features (Discrete Cosine Transform (DCT) coefficients), that better handle appearance variability. Experiments carried out on a public video dataset and on the data from the French REPERE challenge demonstrate the effectiveness of our approach in comparison with state-of-the-art methods.