Disentangling factors of variation for facial expression recognition

  • Authors:
  • Salah Rifai; Yoshua Bengio; Aaron Courville; Pascal Vincent; Mehdi Mirza

  • Affiliations:
  • Department of Computer Science and Operations Research, Université de Montréal, Canada (all authors)

  • Venue:
  • ECCV'12: Proceedings of the 12th European Conference on Computer Vision, Part VI
  • Year:
  • 2012

Abstract

We propose a semi-supervised approach to the task of emotion recognition in 2D face images, using recent ideas from deep learning to handle the factors of variation present in the data. An emotion classification algorithm should be robust to both (1) the variations in pose that remain after the face has been centered and aligned in the image, and (2) the identity or morphology of the face. To achieve this invariance, we propose to learn a hierarchy of features that gradually filters out the factors of variation arising from both (1) and (2). We address (1) with a multi-scale contractive convolutional network (CCNET), which provides invariance to translations of the facial traits in the image. On top of the feature representation produced by the CCNET, we train a Contractive Discriminative Analysis (CDA) feature extractor, a novel variant of the Contractive Auto-Encoder (CAE), designed to learn a representation that separates the emotion-related factors from the others (which mostly capture subject identity, and whatever pose variation remains after the CCNET). This system beats the state of the art on a recently proposed dataset for facial expression recognition, the Toronto Face Database, raising the state-of-the-art accuracy from 82.4% to 85.0%, while the CCNET and CDA improve the accuracy of a standard CAE by 8%.
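The CAE that the CDA builds on penalizes the Frobenius norm of the encoder's Jacobian, which encourages the learned features to be locally invariant to small input perturbations. Below is a minimal sketch of that penalty for a single sigmoid encoder layer with tied decoder weights; the PyTorch framing, layer sizes, and penalty weight `lam` are illustrative assumptions rather than the paper's implementation, and the CDA's discriminative term is only indicated in a comment since its exact form is not given in this abstract.

```python
import torch
import torch.nn as nn

class ContractiveAutoEncoder(nn.Module):
    """Single-layer CAE with tied weights and a sigmoid encoder (sketch)."""

    def __init__(self, n_in=48 * 48, n_hidden=512):
        super().__init__()
        self.W = nn.Parameter(0.01 * torch.randn(n_hidden, n_in))
        self.b_h = nn.Parameter(torch.zeros(n_hidden))
        self.b_r = nn.Parameter(torch.zeros(n_in))

    def forward(self, x):
        h = torch.sigmoid(x @ self.W.t() + self.b_h)   # encoder h = f(x)
        r = torch.sigmoid(h @ self.W + self.b_r)       # tied-weight decoder
        return h, r

    def loss(self, x, lam=0.1):
        h, r = self(x)
        rec = ((r - x) ** 2).sum(dim=1).mean()         # reconstruction error
        # Contraction penalty ||J_f(x)||_F^2. For a sigmoid encoder the
        # Jacobian is J_ji = h_j (1 - h_j) W_ji, giving the closed form
        # sum_j (h_j (1 - h_j))^2 * ||W_j||^2.
        jac = ((h * (1 - h)) ** 2) @ (self.W ** 2).sum(dim=1)
        # The CDA variant described in the abstract would add a
        # discriminative term here, tying a subset of the hidden units
        # to emotion labels so that emotion-related factors separate
        # from identity/pose factors (exact form not specified here).
        return rec + lam * jac.mean()

# Illustrative usage on a batch of flattened face images (the 48x48
# input size matches Toronto Face Database images; batch size assumed):
cae = ContractiveAutoEncoder()
x = torch.rand(32, 48 * 48)
cae.loss(x).backward()
```

The closed-form Jacobian penalty is what makes the CAE cheap to train: for a sigmoid encoder it reduces to an elementwise expression in the hidden activations and the squared row norms of the weight matrix, avoiding any explicit Jacobian computation.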