The dual task is a data-rich paradigm for evaluating speech modes of a synthetic talking head. Three experiments manipulated auditory-visual (AV) and auditory-only (A-only) speech produced by text-to-speech synthesis from a talking head (Experiment 1, single task; Experiment 2, dual task) and natural speech produced by a human male similar in appearance to the talking head (Experiment 3, dual task). In a dual task, participants perform two tasks concurrently, with a secondary reaction time (RT) task that is sensitive to the cognitive processing demands of the primary task. In the primary task, participants either shadowed words or named the superordinate categories to which the words belonged, under AV (dynamic face with lips moving) or A-only (static face) speech modes. First, it was hypothesized that category naming is more difficult than shadowing. This hypothesis was supported in each experiment by significantly longer latencies on the primary task and slower RT on the secondary task. Second, an AV advantage was hypothesized; it was supported by significantly shorter latencies for the AV modality on the primary task of Experiment 3, with partial support in Experiment 1. Third, it was hypothesized that while the AV modality helps, it also creates greater cognitive load. Significantly longer RT for AV presentation in the secondary tasks supported this hypothesis. The results indicate that task difficulty influences speech perception. Performance on a secondary task can reveal cognitive demand that is not evident in a single task or in self-report ratings. A dual task will be an effective evaluation tool in operational environments where multiple tasks are conducted (e.g., responding to spoken directions and monitoring displays) and an implicit, sensitive measure of cognitive load is imperative.