Gaze and turn-taking behavior in casual conversational interactions

  • Authors: Kristiina Jokinen; Hirohisa Furukawa; Masafumi Nishida; Seiichi Yamamoto

  • Affiliations: University of Helsinki, Finland; Doshisha University, Japan; Doshisha University, Japan; Doshisha University, Japan

  • Venue: ACM Transactions on Interactive Intelligent Systems (TiiS) - Special issue on interaction with smart objects, Special section on eye gaze and conversation
  • Year: 2013

Abstract

Eye gaze is an important means for controlling interaction and coordinating the participants' turns smoothly. We have studied how eye gaze correlates with spoken interaction, focusing especially on the combined effect of the speech signal and gazing in predicting turn-taking possibilities. It is well known that mutual gaze is important in the coordination of turn taking in two-party dialogs, and in this article, we investigate whether this also holds for three-party conversations. In group interactions, different features may be used for managing turn taking than in two-party dialogs. We collected casual conversational data and used an eye tracker to systematically observe a participant's gaze in the interactions. By studying the combined effect of speech and gaze on turn taking, we aimed to answer our main questions: How well can eye gaze help in predicting turn taking? What is the role of eye gaze when the speaker holds the turn? Is the role of eye gaze as important in three-party dialogs as in two-party dialogs? We used Support Vector Machines (SVMs) to classify turn-taking events with respect to speech and gaze features, so as to estimate how well the features signal a change of the speaker or a continuation of the same speaker. The results confirm the earlier hypothesis that eye gaze significantly helps in predicting the partner's turn-taking activity, and we also obtain supporting evidence for our hypothesis that the speaker is a prominent coordinator of the interaction space. Such a turn-taking model could be used in interactive applications to improve the system's conversational performance.
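
The classification step described in the abstract can be illustrated with a minimal sketch. It assumes a linear SVM trained by plain subgradient descent on the regularized hinge loss, which is a stand-in and not necessarily the SVM variant or kernel the authors used. The two features (pause length and a gaze-at-partner flag), the toy data, and all function names are hypothetical and invented for illustration; they are not taken from the study.

```python
# Toy illustration only: features and data are invented, not from the study.
# Each row: [pause after the utterance in seconds,
#            1.0 if the speaker gazes at the partner at utterance end, else 0.0]
X = [
    [0.8, 1.0], [0.9, 1.0], [0.7, 1.0], [1.0, 1.0],    # speaker change
    [0.1, 0.0], [0.2, 0.0], [0.15, 0.0], [0.05, 0.0],  # same speaker continues
]
# Labels: +1 = change of speaker, -1 = continuation of the same speaker.
y = [1, 1, 1, 1, -1, -1, -1, -1]


def decision(w, b, x):
    """Signed score of a linear classifier: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Full-batch subgradient descent on
    (lam / 2) * ||w||^2 + mean(max(0, 1 - y * (w . x + b)))."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Only examples inside the margin contribute a hinge subgradient.
            if yi * decision(w, b, xi) < 1:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (speaker change) or -1 (same speaker continues)."""
    return 1 if decision(w, b, x) >= 0 else -1


w, b = train_linear_svm(X, y)
```

Calling `predict(w, b, [0.85, 1.0])` classifies a hypothetical utterance end with a long pause and gaze at the partner. In a real setting the feature vectors would come from annotated speech and eye-tracker data, and comparing classifiers trained with and without the gaze features is one way to estimate how much gaze contributes to the prediction.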