One of the many skills required to engage properly in a conversation is knowing how to apply the rules of engagement. A virtual human or robot should, for instance, be able to recognize when it is being addressed or when the speaker is about to hand over the turn. This paper presents a multimodal approach to end-of-speaker-turn prediction that uses sequential probabilistic models (Conditional Random Fields) to learn a model from observations of real-life multi-party meetings. Although the results fall short of expectations, we provide insight, based on the literature and our own results, into which modalities are important when taking a multimodal approach to the problem.
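To make the modelling setup concrete, the following is a minimal, hypothetical sketch of frame-level end-of-turn prediction with a linear-chain CRF, written against the sklearn-crfsuite library rather than the authors' own implementation. The feature names (gaze at listener, pitch slope, pause length), the labels, and the toy data are illustrative placeholders standing in for the multimodal observations described in the paper.

```python
# Hypothetical sketch only: feature names, labels, and data are illustrative
# placeholders, not the authors' actual feature set or code.
import sklearn_crfsuite

def frame_features(frame):
    """Map one time step of multimodal observations to a CRF feature dict."""
    return {
        "gaze_at_listener": frame["gaze_at_listener"],  # binary gaze cue
        "pitch_slope": frame["pitch_slope"],            # prosodic cue
        "pause_length": frame["pause_length"],          # silence duration in seconds
    }

def to_instance(frames):
    """Convert a sequence of observation frames into (features, labels)."""
    X = [frame_features(f) for f in frames]
    y = [f["label"] for f in frames]  # "turn-end" vs. "hold"
    return X, y

# Toy observation sequences standing in for real multi-party meeting data.
sequences = [
    [{"gaze_at_listener": 0, "pitch_slope": 0.1, "pause_length": 0.0, "label": "hold"},
     {"gaze_at_listener": 1, "pitch_slope": -0.4, "pause_length": 0.6, "label": "turn-end"}],
    [{"gaze_at_listener": 0, "pitch_slope": 0.0, "pause_length": 0.1, "label": "hold"},
     {"gaze_at_listener": 0, "pitch_slope": 0.2, "pause_length": 0.0, "label": "hold"}],
]

X_train, y_train = [], []
for seq in sequences:
    X, y = to_instance(seq)
    X_train.append(X)
    y_train.append(y)

# Linear-chain CRF trained on the frame-level sequences.
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X_train, y_train)

print(crf.predict(X_train))  # per-frame "hold" / "turn-end" predictions
```

In this sketch each meeting is a sequence of frames, each frame a dictionary of multimodal cues, and the CRF labels every frame as holding or ending the turn; which cues actually carry the predictive weight is exactly the question the paper examines.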