A probabilistic multimodal approach for predicting listener backchannels

Authors:
Louis-Philippe Morency;Iwan Kok;Jonathan Gratch
Affiliations:
Institute for Creative Technologies, University of Southern California, Marina del Rey, USA 90292;Human Media Interaction Group, University of Twente, Enschede, The Netherlands 7500AE;Institute for Creative Technologies, University of Southern California, Marina del Rey, USA 90292
Venue:
Autonomous Agents and Multi-Agent Systems
Year:
2010

Citing 11
Cited 18

BEAT: the Behavior Expression Animation Toolkit

Proceedings of the 28th annual conference on Computer graphics and interactive techniques
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
3-D Articulated Pose Tracking for Untethered Diectic Reference

ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
A shallow model of backchannel continuers in spoken dialogue

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Contextual recognition of head gestures

ICMI '05 Proceedings of the 7th international conference on Multimodal interfaces
Natural behavior of a listening agent

Lecture Notes in Computer Science
Does the contingency of agents' nonverbal feedback affect users' social anxiety?

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
SmartBody: behavior realization for embodied conversational agents

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Context-based recognition during human interactions: automatic feature selection and encoding dictionary

ICMI '08 Proceedings of the 10th international conference on Multimodal interfaces
A spoken dialog system for chat-like conversations considering response timing

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
Nonverbal behavior generator for embodied conversational agents

IVA'06 Proceedings of the 6th international conference on Intelligent Virtual Agents

Classification of feedback expressions in multimodal data

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Differences in listener responses between procedural and narrative task

Proceedings of the 2nd international workshop on Social signal processing
Backchannel strategies for artificial listeners

IVA'10 Proceedings of the 10th international conference on Intelligent virtual agents
Multimodal backchannels for embodied conversational agents

IVA'10 Proceedings of the 10th international conference on Intelligent virtual agents
Learning and evaluating response prediction models using parallel listener consensus

International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction
The multiLis corpus - dealing with individual differences in nonverbal listening behavior

Proceedings of the Third COST 2102 international training school conference on Toward autonomous, adaptive, and context-aware multimodal interfaces: theoretical and practical issues
Backchannels: quantity, type and timing matters

IVA'11 Proceedings of the 10th international conference on Intelligent virtual agents
When do we smile? analysis and modeling of the nonverbal context of listener smiles in conversation

ACII'11 Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part I
Computational study of human communication dynamic

J-HGBU '11 Proceedings of the 2011 joint ACM workshop on Human gesture and behavior understanding
Optimizing the turn-taking behavior of task-oriented spoken dialog systems

ACM Transactions on Speech and Language Processing (TSLP)
Developing multimodal web interfaces by encapsulating their content and functionality within a multimodal shell

COST'10 Proceedings of the 2010 international conference on Analysis of Verbal and Nonverbal Communication and Enactment
Estimating a user’s internal state before the first input utterance

Advances in Human-Computer Interaction
A regression-based approach to modeling addressee backchannels

SIGDIAL '12 Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Integrating backchannel prediction models into embodied conversational agents

IVA'12 Proceedings of the 12th international conference on Intelligent Virtual Agents
Incremental dialogue understanding and feedback for multiparty, multimodal conversation

IVA'12 Proceedings of the 12th international conference on Intelligent Virtual Agents
Online behavior evaluation with the switching wizard of oz

IVA'12 Proceedings of the 12th international conference on Intelligent Virtual Agents
From nonverbal cues to perception: personality and social attractiveness

COST'11 Proceedings of the 2011 international conference on Cognitive Behavioural Systems
Speaker-adaptive multimodal prediction model for listener responses

Proceedings of the 15th ACM on International conference on multimodal interaction

Abstract

During face-to-face interactions, listeners use backchannel feedback such as head nods as a signal to the speaker that the communication is working and that they should continue speaking. Predicting these backchannel opportunities is an important milestone for building engaging and natural virtual humans. In this paper, we show how sequential probabilistic models (e.g., Hidden Markov Models or Conditional Random Fields) can automatically learn from a database of human-to-human interactions to predict listener backchannels using the speaker's multimodal output features (e.g., prosody, spoken words and eye gaze). The main challenges addressed in this paper are automatic selection of the relevant features and optimal feature representation for probabilistic models. For prediction of visual backchannel cues (i.e., head nods), our prediction model shows a statistically significant improvement over a previously published approach based on hand-crafted rules.
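
To make the idea of a sequential probabilistic model concrete, here is a minimal sketch of Viterbi decoding in a two-state Hidden Markov Model that labels each time step as a backchannel opportunity or not. It is not the authors' implementation. The state names, the three observation symbols (`speech`, `pause`, `gaze+pause`) and every probability are hypothetical values chosen by hand for illustration; a model of the kind described above would learn its parameters from annotated human-to-human interactions and would use richer prosodic, lexical and gaze features.

```python
import math

STATES = ("none", "backchannel")

# Hypothetical, hand-set parameters for illustration only; a real model
# would estimate them from annotated human-to-human interactions.
START = {"none": 0.9, "backchannel": 0.1}
TRANS = {
    "none": {"none": 0.8, "backchannel": 0.2},
    "backchannel": {"none": 0.6, "backchannel": 0.4},
}
EMIT = {
    "none": {"speech": 0.7, "pause": 0.2, "gaze+pause": 0.1},
    "backchannel": {"speech": 0.1, "pause": 0.3, "gaze+pause": 0.6},
}


def viterbi(observations):
    """Return the most likely listener-state sequence for the speaker cues."""
    if not observations:
        return []

    # score[s] = best log-probability of any state path ending in state s.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []

    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Pick the predecessor that maximizes the path score into s.
            best_prev = max(
                STATES, key=lambda p: score[p] + math.log(TRANS[p][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    cues = ["speech", "speech", "gaze+pause", "speech"]
    print(list(zip(cues, viterbi(cues))))
```

With these toy parameters, the decoder marks the `gaze+pause` step as the only backchannel opportunity in the example sequence. A Conditional Random Field would replace the generative emission and transition tables with discriminatively trained feature weights, but the decoding step has the same dynamic-programming shape.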