Automatic inference of cross-modal nonverbal interactions in multiparty conversations: "who responds to whom, when, and how?" from gaze, head gestures, and utterances

  • Authors:
  • Kazuhiro Otsuka;Hiroshi Sawada;Junji Yamato

  • Affiliations:
  • NTT Communication Science Labs., Atsugi, Japan;NTT Communication Science Labs., Kyoto, Japan;NTT Communication Science Labs., Kyoto, Japan

  • Venue:
  • Proceedings of the 9th international conference on Multimodal interfaces
  • Year:
  • 2007

Abstract

A novel probabilistic framework is proposed for analyzing cross-modal nonverbal interactions in multiparty face-to-face conversations. The goal is to determine "who responds to whom, when, and how" from multimodal cues including gaze, head gestures, and utterances. We formulate this problem as the probabilistic inference of the causal relationships among participants' behaviors involving head gestures and utterances. To solve this problem, we propose a hierarchical probabilistic model in which the structures of interactions are probabilistically determined from high-level conversation regimes (such as monologue or dialogue) and gaze directions. Based on the model, the interaction structures, gaze directions, and conversation regimes are simultaneously inferred from observed head motion and utterances using a Markov chain Monte Carlo method. The head gestures, including nodding, shaking, and tilting, are recognized with a novel wavelet-based technique from magnetic sensor signals. The utterances are detected using data captured by lapel microphones. Experiments on four-person conversations confirm the effectiveness of the framework in discovering interactions such as question-and-answer and addressing behavior followed by back-channel responses.
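
The layering described in the abstract (a conversation regime at the top, gaze directions below it, and interaction structures at the bottom) can be illustrated with a toy forward sampler. This is only a sketch under stated assumptions and not the authors' model: the function name `sample_frame`, the participant labels, and every probability value are hypothetical, and the code only generates samples from the hierarchy. It does not perform the Markov chain Monte Carlo inference, the wavelet-based gesture recognition, or the utterance detection reported in the paper.

```python
import random

# Four participants, matching the four-person setting of the experiments.
PEOPLE = ["A", "B", "C", "D"]


def sample_frame(rng):
    """Draw one toy sample: regime -> speakers -> gaze -> interactions.

    All probabilities below are illustrative assumptions, not values
    taken from the paper.
    """
    # Top level: conversation regime (assumed prior of 0.6 for monologue).
    regime = "monologue" if rng.random() < 0.6 else "dialogue"

    # A monologue has one speaker; a dialogue has two.
    if regime == "monologue":
        speakers = [rng.choice(PEOPLE)]
    else:
        speakers = rng.sample(PEOPLE, 2)

    # Middle level: each person's gaze target depends on the regime.
    gaze = {}
    for person in PEOPLE:
        others = [q for q in PEOPLE if q != person]
        if person in speakers:
            # A speaker looks at the other speaker if there is one,
            # otherwise at a uniformly chosen listener.
            partners = [q for q in speakers if q != person] or others
            gaze[person] = rng.choice(partners)
        elif rng.random() < 0.8:
            # A listener usually looks at a speaker (assumed 0.8).
            gaze[person] = rng.choice(speakers)
        else:
            gaze[person] = rng.choice(others)

    # Bottom level: a listener gazing at a speaker may respond to that
    # speaker, for example with a nod (assumed probability 0.5).
    interactions = [
        (person, gaze[person])  # (responder, addressed speaker)
        for person in PEOPLE
        if person not in speakers
        and gaze[person] in speakers
        and rng.random() < 0.5
    ]
    return regime, speakers, gaze, interactions


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_frame(rng))
```

Each sample is a tuple of the regime, the speaker list, a gaze map, and a list of (responder, speaker) pairs. Inference in the paper runs in the opposite direction, recovering these hidden layers from observed head motion and utterances.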