Structured output ordinal regression for dynamic facial emotion intensity prediction

  • Authors:
  • Minyoung Kim; Vladimir Pavlovic

  • Affiliations:
  • Department of Computer Science, Rutgers University, Piscataway, NJ (both authors)

  • Venue:
  • ECCV'10: Proceedings of the 11th European Conference on Computer Vision, Part III
  • Year:
  • 2010

Abstract

We consider the task of labeling facial emotion intensities in videos, where the intensities to be predicted take values on an ordinal scale (e.g., low, medium, and high) and change over time. A significant challenge is that the rates at which intensity rises and falls differ substantially across subjects. Moreover, the absolute differences between intensity values carry little information; their relative order is what matters. To solve the intensity prediction problem we propose a new dynamic ranking model that treats the signal intensity at each time step as a label on an ordinal scale and links temporally proximal labels through dynamic smoothness constraints. This model extends successful static ordinal regression to a structured (dynamic) setting by analogy with Conditional Random Field (CRF) models in structured classification. We show that, although the resulting objective is non-convex, the model can be accurately learned using efficient gradient search. The predictions of this dynamic ranking model show significant improvements over regular CRFs, which fail to account for the ordinal relationships between predicted labels. We also observe substantial improvements over static ranking models that do not exploit temporal dependencies among ordinal predictions. We demonstrate the benefits of our algorithm on the Cohn-Kanade dataset for dynamic facial emotion intensity prediction and illustrate its performance in a controlled synthetic setting.
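To make the idea concrete, below is a minimal sketch (not the authors' formulation) of a chain model that combines a probit-style ordinal node potential with an absolute-difference temporal smoothness term, and recovers the most likely intensity sequence by Viterbi decoding. All names and parameters (node_log_probs, predict_sequence, w, thresholds, sigma, smooth) are illustrative assumptions; the gradient-based learning of the non-convex conditional likelihood described in the abstract is omitted.

```python
import numpy as np
from scipy.stats import norm

def node_log_probs(X, w, thresholds, sigma=1.0):
    """Ordinal node potentials: project each frame's features onto a scalar
    f_t = w . x_t, then score each of the K intensity levels by the
    probability mass between consecutive cut points (probit-style ordinal model).
    X: (T, D) frame features; thresholds: increasing vector of length K-1."""
    f = X @ w                                                # (T,)
    b = np.concatenate(([-np.inf], thresholds, [np.inf]))    # K+1 cut points
    upper = norm.cdf((b[None, 1:] - f[:, None]) / sigma)     # (T, K)
    lower = norm.cdf((b[None, :-1] - f[:, None]) / sigma)    # (T, K)
    return np.log(np.clip(upper - lower, 1e-12, None))       # log node scores

def predict_sequence(node_log, smooth=1.0):
    """MAP label sequence for a chain whose pairwise term
    -smooth * |y_t - y_{t-1}| penalizes abrupt intensity jumps (Viterbi)."""
    T, K = node_log.shape
    levels = np.arange(K)
    pair = -smooth * np.abs(levels[:, None] - levels[None, :])  # (K_prev, K_cur)
    delta = node_log[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + pair           # best score for each (prev, cur) pair
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + node_log[t]
    y = np.empty(T, dtype=int)
    y[-1] = int(delta.argmax())
    for t in range(T - 1, 0, -1):                # backtrack
        y[t - 1] = back[t, y[t]]
    return y

# Toy usage: 3 intensity levels, random frame features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
w = rng.normal(size=10)
y_hat = predict_sequence(node_log_probs(X, w, thresholds=np.array([-0.5, 0.5])))
```

In this sketch, replacing the threshold-based node potential with independent per-label weight vectors would give an ordinary chain CRF, the kind of baseline the abstract compares against; the ordinal construction instead ties all levels to a single projection and a set of ordered thresholds.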