Speaker-adaptive multimodal prediction model for listener responses

  • Authors:
  • Iwan de Kok;Dirk Heylen;Louis-Philippe Morency

  • Affiliations:
  • University of Twente, Enschede, Netherlands;University of Twente, Enschede, Netherlands;USC Institute for Creative Technologies, Los Angeles, USA

  • Venue:
  • Proceedings of the 15th ACM on International conference on multimodal interaction
  • Year:
  • 2013

Abstract

The goal of this paper is to analyze and model the variability in speaking styles in dyadic interactions and to build a predictive algorithm for listener responses that can adapt to these different styles. The end result of this research will be a virtual human able to automatically respond to a human speaker with proper listener responses (e.g., head nods). Our novel speaker-adaptive prediction model is created from a corpus of dyadic interactions in which speaker variability is analyzed to identify a subset of prototypical speaker styles. During a live interaction, our prediction model automatically identifies the closest prototypical speaker style and predicts listener responses based on this "communicative style". Central to our approach is the idea of a "speaker profile", which uniquely identifies each speaker and enables the matching between prototypical speakers and new speakers. The paper shows the merits of our speaker-adaptive listener response prediction model by demonstrating an improvement over a state-of-the-art approach that does not adapt to the speaker. Besides the merits of speaker adaptation, our experiments highlight the importance of using multimodal features when comparing speakers to select the closest prototypical speaker style.
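
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how speaker profiles are represented or compared, so the numeric profile vectors, the Euclidean distance, the style names, and the per-style predictors below are all hypothetical assumptions chosen only to show the idea of selecting the closest prototypical style and then predicting with that style's model.

```python
import math

# Hypothetical speaker profiles for two prototypical styles.
# The vector contents are made up for illustration only.
PROTOTYPES = {
    "style_a": [0.2, 0.8, 0.1],
    "style_b": [0.9, 0.1, 0.5],
}

# Hypothetical per-style predictors: each maps one frame of speaker
# features to a listener-response probability. Stand-ins for trained models.
MODELS = {
    "style_a": lambda frame: 0.9 if frame["pause"] else 0.1,
    "style_b": lambda frame: 0.6 if frame["gaze_at_listener"] else 0.2,
}


def closest_prototype(profile, prototypes):
    """Return the name of the prototype whose profile is nearest to `profile`.

    Euclidean distance is an assumption; the abstract names no metric.
    """
    return min(prototypes, key=lambda name: math.dist(profile, prototypes[name]))


def predict_listener_response(profile, frame, prototypes, models):
    """Pick the closest prototypical style, then predict with its model."""
    style = closest_prototype(profile, prototypes)
    return style, models[style](frame)


if __name__ == "__main__":
    new_speaker = [0.25, 0.7, 0.2]
    frame = {"pause": True, "gaze_at_listener": False}
    print(predict_listener_response(new_speaker, frame, PROTOTYPES, MODELS))
```

In this sketch the profile is computed once per speaker while predictions are made per frame, mirroring the abstract's separation between identifying a communicative style and predicting responses based on it.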