Gesture Cues for Conversational Interaction in Monocular Video

  • Authors:
  • Francis Quek; David McNeill; Rashid Ansari; Xin-Feng Ma; Robert Bryll; Susan Duncan; Karl E. McCullough

  • Venue:
  • RATFG-RTS '99 Proceedings of the International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems
  • Year:
  • 1999

Abstract

We present our work on the determination of cues for discourse segmentation in free-form gesticulation accompanying speech in natural conversation. The basis for this integration of gesticulation and speech discourse is the psycholinguistic concept of the co-equal generation of gesture and speech from the same semantic intent. We use the psycholinguistic device known as the 'catchment' as the locus around which this integration proceeds. We videotape gesture and speech elicitation experiments in which a subject describes her living space to an interlocutor. We extract the gestural motion of both hands using the Vector Coherence Mapping algorithm, which combines spatial, momentum, and skin-color constraints in parallel using a fuzzy image processing approach. We extract the voiced units in the discourse as F0 units and correlate these with the transcribed speech. Psycholinguistics researchers perceptually micro-analyze the same videotape to produce a transcript annotated with video timestamps and perceived gesture-speech entities. These serve to direct our high-level analysis of the gesture trace and F0 data. We report the results of our analysis, which show that the feature of 'handedness' and the kind of symmetry in two-handed gestures provide effective cues for discourse segmentation. We also present observations on how the gesture traces provide cues for segmenting hand use, for high-level discourse repair, and supra-segmental cues for discourse grouping.
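The full Vector Coherence Mapping algorithm is described in the authors' related work; the abstract only states that spatial, momentum, and skin-color constraints are combined in parallel with fuzzy image processing. As an illustration only, the sketch below shows one way such a fuzzy combination and displacement selection could look for a single image patch. The function names (fuzzy_combine, best_displacement), the product t-norm used as the fuzzy AND, and the random stand-in constraint maps are all assumptions for this example, not the published algorithm.

```python
import numpy as np


def fuzzy_combine(*constraint_maps):
    """Fuzzy AND (product t-norm) of several constraint maps with values in [0, 1]."""
    combined = np.ones_like(constraint_maps[0])
    for m in constraint_maps:
        combined = combined * m
    return combined


def best_displacement(correlation, spatial, momentum, skin):
    """Pick the candidate displacement with the highest combined support.

    Each argument is a (2R+1, 2R+1) map of support values over candidate
    displacements (-R..R in y and x) for one image patch.
    """
    support = fuzzy_combine(correlation, spatial, momentum, skin)
    iy, ix = np.unravel_index(np.argmax(support), support.shape)
    r = support.shape[0] // 2
    return (iy - r, ix - r), support[iy, ix]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R = 4
    shape = (2 * R + 1, 2 * R + 1)
    # Hypothetical per-patch support maps; in a real system these might come
    # from normalized cross-correlation, a smoothness prior from neighboring
    # vectors, a momentum prior from the previous frame, and a skin-color
    # likelihood from a color model.
    correlation = rng.random(shape)
    spatial = rng.random(shape)
    momentum = rng.random(shape)
    skin = rng.random(shape)
    (dy, dx), s = best_displacement(correlation, spatial, momentum, skin)
    print(f"chosen displacement: ({dy}, {dx}), support {s:.3f}")
```

In this sketch the constraints act as parallel soft gates: a candidate displacement must score well on all of them to win, which mirrors the abstract's description of combining the constraints in parallel rather than sequentially.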