Efficient illumination independent appearance-based face tracking

  • Authors:
  • José M. Buenaposada; Enrique Muñoz; Luis Baumela

  • Affiliations:
  • José M. Buenaposada: Departamento de Ciencias de la Computación, ETSI Informática, Universidad Rey Juan Carlos, Spain
  • Enrique Muñoz: Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid, Spain
  • Luis Baumela: Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid, Spain

  • Venue:
  • Image and Vision Computing
  • Year:
  • 2009

Abstract

A major challenge for visual tracking algorithms is coping with changes in the appearance of the target during tracking. Linear subspace models have been extensively studied and are possibly the most popular way of modelling target appearance. We introduce a linear subspace representation in which the appearance of a face is the addition of two approximately independent linear subspaces modelling facial expressions and illumination, respectively. This model is more compact than previous bilinear or multilinear approaches, and the independence assumption notably simplifies system training: we require only two image sequences, one in which a single facial expression is imaged under all illuminations of interest, and another in which the face adopts all facial expressions under one particular illumination. This simple model enables us to train the system with no manual intervention. We also revisit the problem of efficiently fitting a linear subspace-based model to a target image and introduce an additive procedure for solving it. We prove that Matthews and Baker's inverse compositional approach makes a smoothness assumption on the subspace basis that is equivalent to Hager and Belhumeur's, which worsens convergence. Our approach differs from Hager and Belhumeur's additive and Matthews and Baker's compositional approaches in that we make no smoothness assumptions on the subspace basis. Our experiments show that the proposed model accurately represents the appearance variations caused by illumination changes and facial expressions, and that our fitting procedure is more accurate and has a better convergence rate than the related approaches, albeit at a slight increase in computational cost. The approach can track a human face at standard video frame rates on an average personal computer.
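To make the additive appearance model concrete, below is a minimal Python/NumPy sketch, not the authors' implementation. It assumes orthonormal expression and illumination bases B_exp and B_ill learned offline (e.g. by PCA on the two training sequences the abstract describes); all names, dimensions, and data here are illustrative placeholders.

```python
import numpy as np

# Sketch of the additive two-subspace model: a rasterised face image x is
# modelled as a mean appearance plus the sum of two approximately independent
# linear subspaces, one for facial expression and one for illumination.

rng = np.random.default_rng(0)
d, k_exp, k_ill = 1024, 10, 5          # pixels, expression dims, illumination dims

mu = rng.standard_normal(d)                               # mean appearance (placeholder)
B_exp, _ = np.linalg.qr(rng.standard_normal((d, k_exp)))  # orthonormal expression basis
B_ill, _ = np.linalg.qr(rng.standard_normal((d, k_ill)))  # orthonormal illumination basis

def synthesize(c_exp, c_ill):
    """Additive model: x = mu + B_exp @ c_exp + B_ill @ c_ill."""
    return mu + B_exp @ c_exp + B_ill @ c_ill

def fit(x):
    """Least-squares appearance coefficients for a rectified target image x.
    Stacking the two bases gives one linear system, solved in closed form."""
    B = np.hstack([B_exp, B_ill])
    c, *_ = np.linalg.lstsq(B, x - mu, rcond=None)
    return c[:k_exp], c[k_exp:]

# Round trip: coefficients recovered from a synthesized image match the originals.
c_e, c_i = rng.standard_normal(k_exp), rng.standard_normal(k_ill)
e_hat, i_hat = fit(synthesize(c_e, c_i))
assert np.allclose(c_e, e_hat) and np.allclose(c_i, i_hat)
```

In the paper itself the appearance coefficients are estimated jointly with the motion parameters inside an efficient additive Gauss-Newton style loop; the closed-form solve above only illustrates the additive structure of the model, not the full tracking procedure.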