3D shape estimation in video sequences provides high precision evaluation of facial expressions

  • Authors:
  • László A. Jeni; András Lőrincz; Tamás Nagy; Zsolt Palotai; Judit Sebők; Zoltán Szabó; Dániel Takács

  • Affiliations:
  • University of Tokyo, Japan; Eötvös Loránd University, Hungary; Eötvös Loránd University, Hungary; Sparsense Inc., USA; Eötvös Loránd University, Hungary; Eötvös Loránd University, Hungary; Eötvös Loránd University, Hungary and Realeyes Data Services Ltd., UK

  • Venue:
  • Image and Vision Computing
  • Year:
  • 2012

Abstract

Person-independent and pose-invariant estimation of facial expressions and action unit (AU) intensities is important for situation analysis and for automated video annotation. We evaluated the raw 2D shape data of the CK+ database, using a Procrustes transformation and a multi-class SVM with the leave-one-out method for classification. We found close to 100% performance, demonstrating the relevance and strength of the shape details. Precise 3D shape information was computed by means of constrained local models (CLM) on video sequences. Such sequences offer the opportunity to compute a time-averaged '3D personal mean shape' (PMS) from the estimated CLM shapes, which, upon subtraction, gives rise to person-independent emotion estimation. On CK+ data, PMS showed significant improvements over AU0 normalization; performance reached and sometimes surpassed state-of-the-art results on emotion classification and on AU intensity estimation. 3D PMS from 3D CLM offers pose-invariant emotion estimation, which we studied by rendering a 3D emotional database for different poses and different subjects from the BU 4DFE database. Frontal shapes derived from CLM fits of the 3D shape were evaluated. Results demonstrate that shape estimation alone can be used for robust, high-quality, pose-invariant emotion classification and AU intensity estimation.
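
The abstract outlines a shape-only pipeline: align landmark shapes with a Procrustes transformation, subtract a time-averaged personal mean shape (PMS) to remove identity, and classify the residual shapes with a multi-class SVM under leave-one-out evaluation. The sketch below illustrates these steps under stated assumptions; it is not the authors' implementation. The landmark format (arrays of n_points × 2 coordinates), the use of NumPy/scikit-learn, the linear kernel, and grouping the leave-one-out folds by subject are all illustrative choices.

```python
# Minimal sketch of the shape-based pipeline described in the abstract.
# Assumptions: landmarks are (n_points, 2) arrays, one array per frame;
# scikit-learn's SVC stands in for the multi-class SVM; folds are grouped
# by subject (a common reading of "leave-one-out" for person-independent
# evaluation, but an assumption here).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut


def procrustes_align(shape, reference):
    """Similarity-align one landmark shape (n_points x 2) to a reference."""
    shape_c = shape - shape.mean(axis=0)          # remove translation
    ref_c = reference - reference.mean(axis=0)
    shape_c = shape_c / np.linalg.norm(shape_c)   # remove scale
    ref_c = ref_c / np.linalg.norm(ref_c)
    # Optimal rotation from the SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(ref_c.T @ shape_c)
    rotation = u @ vt
    return shape_c @ rotation.T


def personal_mean_shape(aligned_sequence):
    """Time-averaged shape over one subject's aligned video frames."""
    return np.mean(aligned_sequence, axis=0)


def person_independent_features(aligned_sequence):
    """Subtract the PMS so the residual mainly carries the expression."""
    return aligned_sequence - personal_mean_shape(aligned_sequence)


def leave_one_subject_out_accuracy(features, labels, subject_ids):
    """Multi-class SVM accuracy with leave-one-subject-out cross-validation."""
    X = features.reshape(len(features), -1)       # flatten landmarks per frame
    logo = LeaveOneGroupOut()
    correct, total = 0, 0
    for train_idx, test_idx in logo.split(X, labels, groups=subject_ids):
        clf = SVC(kernel="linear", C=1.0)         # hyperparameters are placeholders
        clf.fit(X[train_idx], labels[train_idx])
        correct += np.sum(clf.predict(X[test_idx]) == labels[test_idx])
        total += len(test_idx)
    return correct / total
```

In this reading, the PMS subtraction plays the same role as AU0 (neutral-frame) normalization but averages over the whole sequence, which is why it can be computed without an explicitly labeled neutral frame.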