Toward adaptive conversational interfaces: Modeling speech convergence with animated personas

  • Authors:
  • Sharon Oviatt;Courtney Darves;Rachel Coulston

  • Affiliations:
  • Oregon Health and Science University, Beaverton, OR;Oregon Health and Science University, Beaverton, OR;Oregon Health and Science University, Beaverton, OR

  • Venue:
  • ACM Transactions on Computer-Human Interaction (TOCHI)
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

The design of robust interfaces that process conversational speech is a challenging research direction largely because users' spoken language is so variable. This research explored a new dimension of speaker stylistic variation by examining whether users' speech converges systematically with the text-to-speech (TTS) heard from a software partner. To pursue this question, a study was conducted in which twenty-four 7 to 10-year-old children conversed with animated partners that embodied different TTS voices. An analysis of children's amplitude, durational features, and dialogue response latencies confirmed that they spontaneously adapt several basic acoustic-prosodic features of their speech 10--50%, with the largest adaptations involving utterance pause structure and amplitude. Children's speech adaptations were relatively rapid, bidirectional, and dynamically readaptable when introduced to new partners, and generalized across different types of users and TTS voices. Adaptations also occurred consistently, with 70--95% of children converging with their partner's TTS, although individual differences in magnitude of adaptation were evident. In the design of future conversational systems, users' spontaneous convergence could be exploited to guide their speech within system processing bounds, thereby enhancing robustness. Adaptive system processing could yield further significant performance gains. The long-term goal of this research is the development of predictive models of human-computer communication to guide the design of new conversational interfaces.