This article introduces a new concept: a perception-production timing model for human-machine communication. The model implements a low-level cognitive mechanism for timing and coordination. Its basic element is a dynamic oscillator capable of tracking recurring events in time. A network of such oscillators is referred to as the Dynamic Perception-Production Oscillation Model (DPPOM). The DPPOM is largely based on findings from psychological and phonetic experiments on timing in speech perception and production. It consists of two sub-systems: a perception sub-system, which accounts for information clustering in an input sequence of events, and a production sub-system, which accounts for speech production that is rhythmically entrained to the input sequence. We propose a system architecture that integrates both sub-systems and provides a flexible mechanism for perception-production timing in dialogues. The model's functionality was evaluated in two experiments.
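To make the idea of an oscillator that tracks recurring events more concrete, the following is a minimal sketch of a single adaptive oscillator using linear phase and period correction. It is not the DPPOM's own formulation, which the abstract does not specify: the class name `AdaptiveOscillator`, the helper `track`, the gain parameters, and the update rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveOscillator:
    """Hypothetical oscillator; not the DPPOM's actual equations."""
    period: float               # current estimate of the inter-event interval (s)
    next_expected: float        # predicted time of the next event (s)
    phase_gain: float = 0.5     # assumed share of the error used to shift the phase
    period_gain: float = 0.3    # assumed share of the error used to adjust the period

    def observe(self, t: float) -> float:
        """Update the oscillator with an event at time t and return the timing error."""
        error = t - self.next_expected           # positive if the event came late
        self.period += self.period_gain * error  # late events lengthen the period
        # Shift the prediction toward the observed event, then advance one period.
        self.next_expected += self.phase_gain * error + self.period
        return error


def track(onsets, initial_period):
    """Feed a list of event onsets (s) to a fresh oscillator."""
    osc = AdaptiveOscillator(period=initial_period,
                             next_expected=onsets[0] + initial_period)
    errors = [osc.observe(t) for t in onsets[1:]]
    return osc, errors


if __name__ == "__main__":
    # Events every 0.4 s, while the oscillator starts out assuming 0.5 s.
    onsets = [0.4 * i for i in range(40)]
    osc, errors = track(onsets, initial_period=0.5)
    print(f"estimated period: {osc.period:.4f} s")
    print(f"last timing error: {errors[-1]:.6f} s")
```

In this sketch, perception corresponds to the error-driven updates in `observe`, and a production component could schedule its output at `next_expected` so that it stays entrained to the input rhythm. With the gains assumed here, the timing error shrinks over successive events for a perfectly regular input; other gain values may converge more slowly or not at all.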