Over the last few years, a number of studies have addressed the question of how the addressee of an utterance can be determined from observable behavioural features in mixed human-human and human-computer interaction (e.g. someone talking alternately to a robot and to another person). In such settings, behaviour is often strongly influenced by the difference in communicative ability between the robot and the human partner, and by the "salience" of the robot or system, which turns it into a situational distractor. In the current paper, we study triadic human-human communication in which one of the participants plays the role of an information retrieval agent (as in a travel agency, where two customers who want to book a vacation engage in a dialogue with the travel agent to specify constraints on their preferred options). Through a perception experiment, we investigate the role of audio and visual cues as markers of the addressee-hood of the customers' utterances. The results show that audio and visual cues each provide distinct types of information, and that combined audio-visual cues give the best performance. In addition, we present a detailed analysis of the eye gaze behaviour of the information retrieval agent, both while listening and while speaking, which provides input for modelling the behaviour of an embodied conversational agent.
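As a purely illustrative sketch (not the classifier, features, or data used in the study), the snippet below shows one common way that audio-only, visual-only, and combined audio-visual cues for addressee classification can be compared: train the same classifier on each cue set and on their feature-level concatenation. All feature names and the synthetic data are assumptions introduced here for illustration.

```python
# Hypothetical sketch: comparing audio, visual, and fused audio-visual
# cues for addressee classification (agent-addressed vs. partner-addressed).
# Features and data are illustrative placeholders, not from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200  # number of utterances

# Audio cues per utterance (assumed): duration, mean pitch, energy.
audio = rng.normal(size=(n, 3))
# Visual cues per utterance (assumed): fraction of speaking time spent
# gazing at the agent, and fraction spent gazing at the other customer.
visual = rng.normal(size=(n, 2))
# Binary label: 1 = utterance addressed to the agent, 0 = to the partner.
labels = rng.integers(0, 2, size=n)

def report(features, name):
    # Same classifier for every cue set, so differences reflect the cues.
    clf = LogisticRegression(max_iter=1000)
    acc = cross_val_score(clf, features, labels, cv=5).mean()
    print(f"{name}: mean accuracy {acc:.2f}")

report(audio, "audio only")
report(visual, "visual only")
# Feature-level fusion: concatenate both cue sets. The abstract reports
# that combined audio-visual cues perform best in the perception study.
report(np.hstack([audio, visual]), "audio-visual")
```

With real features, the same comparison would reveal whether the two modalities carry complementary information: fusion only helps when the audio and visual cues are not redundant, which is consistent with the finding that each cue type provides distinct information.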