Predicting next speaker and timing from gaze transition patterns in multi-party meetings

Authors:
Ryo Ishii;Kazuhiro Otsuka;Shiro Kumano;Masafumi Matsuda;Junji Yamato
Affiliations:
NTT Corporation, Kanagawa, Japan;NTT Corporation, Kanagawa, Japan;NTT Corporation, Kanagawa, Japan;NTT Corporation, Kyoto, Japan;NTT Corporation, Kanagawa, Japan
Venue:
Proceedings of the 15th ACM on International conference on multimodal interaction
Year:
2013

Abstract

In multi-party meetings, participants need to predict when the current speaker's utterance will end and who will speak next, and to plan the timing of their own next utterance. Gaze behavior plays an important role in smooth turn-taking. This paper proposes a mathematical prediction model with three processing steps that predict (I) whether turn-taking or turn-keeping will occur, (II) who will be the next speaker in turn-taking, and (III) the timing of the start of the next speaker's utterance. As the feature for the model, we focus on gaze transition patterns near the end of an utterance. We collected corpus data of multi-party meetings and analyzed how the frequencies of appearance of gaze transition patterns differ across the situations in (I), (II), and (III). On the basis of this analysis, we construct a probabilistic mathematical model that uses the frequencies of appearance of all participants' gaze transition patterns. An evaluation shows that the proposed models achieve higher precision than models that do not take gaze transition patterns into account.
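
The abstract does not give the model's formulation, so the following is only a minimal sketch of how a frequency-based probabilistic predictor for step (I) could look. It assumes a naive Bayes combination of per-participant gaze transition patterns with Laplace smoothing, which is a stand-in technique and not necessarily the authors' method. The class name `GazePatternClassifier`, the pattern labels such as `"spk:other->listener"`, and the toy training data are all hypothetical and invented for illustration.

```python
import math
from collections import Counter, defaultdict


class GazePatternClassifier:
    """Naive-Bayes-style classifier over gaze transition pattern labels.

    Illustrative only: the smoothing scheme and the independence
    assumption across participants are assumptions of this sketch.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing constant (assumption)
        self.class_counts = Counter()               # how often each outcome occurred
        self.pattern_counts = defaultdict(Counter)  # pattern frequencies per outcome
        self.vocab = set()                          # all pattern labels seen in training

    def fit(self, samples):
        # samples: iterable of (patterns, label), where patterns holds one
        # gaze transition pattern label per participant near the utterance end.
        for patterns, label in samples:
            self.class_counts[label] += 1
            for p in patterns:
                self.pattern_counts[label][p] += 1
                self.vocab.add(p)
        return self

    def log_scores(self, patterns):
        total = sum(self.class_counts.values())
        v = len(self.vocab) + 1  # one extra slot for unseen patterns
        scores = {}
        for label, n in self.class_counts.items():
            denom = sum(self.pattern_counts[label].values()) + self.alpha * v
            score = math.log(n / total)  # log prior from outcome frequency
            for p in patterns:
                count = self.pattern_counts[label][p]
                score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, patterns):
        scores = self.log_scores(patterns)
        return max(scores, key=scores.get)


# Hypothetical toy corpus: (speaker pattern, listener pattern) -> outcome.
train = [
    (("spk:other->listener", "lis:speaker->other"), "turn-taking"),
    (("spk:other->listener", "lis:speaker->speaker"), "turn-taking"),
    (("spk:listener->other", "lis:speaker->other"), "turn-keeping"),
    (("spk:other->other", "lis:other->other"), "turn-keeping"),
]

model = GazePatternClassifier().fit(train)
print(model.predict(("spk:other->listener", "lis:speaker->speaker")))
```

Under the same assumptions, step (II) could reuse the class with candidate next speakers as labels, while step (III) predicts a continuous timing value and would need a regression model instead.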