Arm gesture variations during presentations are correlated with conjunctions indicating contrast

  • Authors:
  • John R. Zhang; John R. Kender

  • Affiliations:
  • Columbia University, New York, NY, USA; Columbia University, New York, NY, USA

  • Venue:
  • Proceedings of the 2012 ACM workshop on User experience in e-learning and augmented technologies in education
  • Year:
  • 2012

Abstract

Studies in linguistics and psychology have long observed correlations between gestures and content in speech. We explore an aspect of this phenomenon within the framework of the automatic classification of upper-body gestures. We demonstrate a correlation between the variances of natural arm motions and the presence of those conjunctions that are used to contrast connected clauses ("but", "neither", etc.). We examine educational lectures automatically, by first modeling the speaker's head, torso, and arms and then extracting statistical features from their image flows. An AdaBoost-based binary classifier using decision trees as weak learners labels each clip according to whether its speech content contains such conjunctions. Our database of 3.83 hours of video is segmented into 4243 clips, each with subtitles; speakers are of different ethnicities and genders, discussing a variety of subject matter. We show that training on the set of all conjunctions produces a classifier that performs no better than chance, but that training on sets of conjunctions indicating contrast achieves 55% accuracy on a balanced test set. We speculate that such gestures are used to emphasize underlying semantic complexity, and that such classifiers can be used in presentation video browsers to locate semantically significant video segments.
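The classifier setup described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: scikit-learn's AdaBoostClassifier with shallow decision trees stands in for their AdaBoost variant (the `estimator` keyword assumes scikit-learn ≥ 1.2), synthetic random features stand in for the image-flow statistics derived from the head-torso-arms model, and the stratified split approximates the balanced test set.

```python
# Hedged sketch: AdaBoost with decision-tree weak learners, predicting from
# per-clip motion features whether the clip's subtitles contain a contrast
# conjunction. Feature extraction from image flows is out of scope here;
# random stand-in features are used, so accuracy lands near chance.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical stand-in: one feature vector per clip (e.g., variances of
# arm-region optical flow). 4243 matches the paper's clip count.
n_clips, n_features = 4243, 16
X = rng.normal(size=(n_clips, n_features))
# Binary label: 1 if the clip's subtitles contain a contrast conjunction.
y = rng.integers(0, 2, size=n_clips)

# Stratified split keeps the test set class-balanced.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # weak learners
    n_estimators=200,
    random_state=0)
clf.fit(X_train, y_train)

print(f"accuracy: {accuracy_score(y_test, clf.predict(X_test)):.3f}")
```

With real motion-variance features in place of the random ones, this pipeline is the kind of setup that the abstract reports reaching 55% accuracy when trained only on contrast conjunctions.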